Kubernetes · Docker · Terraform · AWS · Azure · GCP
Apache Spark · Kafka · Flink · Airflow · Databricks
PostgreSQL · Redis · Snowflake · Neo4j · MongoDB · S3
Machine Learning · Deep Learning · NLP · LLM · RAG
Python · Scala · SQL · Bash · FastAPI · React
GenAI · RAG · Vector Embeddings
Data Pipeline · ETL · Streaming · Real-Time
CI/CD · DevOps · MLOps · DataOps · Microservices · gRPC
Anwar Knyane

Software Engineer — Cloud · Data & AI · Databricks
📍 Buenos Aires, Argentina

Building data platforms that scale

Lead Data Engineer and Solutions Architect with over 10 years of experience in software development and big data ecosystems. Proven track record of pioneering methodologies, optimizing software engineering processes, and crafting innovative solutions in the FinTech, e-commerce, and venture capital industries. Adept at leading cross-functional teams and leveraging advanced technologies to drive business growth and operational efficiency.

Skills & Technologies

☁️ Cloud & Infra

AWS · Azure · Terraform · Kubernetes · Docker

Data Engineering

Spark · Kafka · Flink · Airflow · Databricks

🧠 AI / ML

LLM · RAG · NLP · GenAI

🗄️ Databases

PostgreSQL · Snowflake · Redis · Neo4j · S3 · Delta Lake · Unity Catalog

💻 Languages

Python · Scala · SQL · Bash

🔄 DevOps / MLOps

CI/CD · DataOps · Monitoring

Professional Experience

APR 2026 — PRESENT

Vice President - Lead Software Engineer

JPMorganChase
  • Lead initiatives to improve enterprise data quality, consistency, and governance across distributed data platforms, reducing reporting discrepancies and increasing reliability of business-critical datasets.
  • Design and optimize scalable ETL and analytics solutions using Databricks and AWS to support regulatory, operational, and business intelligence workloads.
  • Manage release planning and deployment processes for data platform enhancements, ensuring smooth production rollouts, platform stability, and compliance with enterprise change management standards.
  • Implement AI-driven solutions leveraging Generative AI, LLMs, and predictive analytics within Databricks and AWS to improve operational efficiency, data validation, and internal knowledge discovery.
  • Collaborate with cross-functional teams including engineering, product, risk, and operations to deliver scalable data solutions aligned with business and regulatory requirements.
  • Develop automated data quality monitoring frameworks, testing and validation rules, and observability dashboards, significantly reducing manual reconciliation efforts and production incidents.
  • Contribute to modernization of enterprise data architecture and AI adoption strategy by evaluating and implementing advanced Databricks, AWS, and AI-assisted analytics solutions.
  • Improve CI/CD and release processes for data and AI applications, accelerating delivery cycles while maintaining high standards for testing, security, and reliability.
MAR 2025 — AUG 2025

Senior Data Engineer

Mobile Computing — A Grid Dynamics Company
  • Designed and deployed a large-scale cloud-native data platform on AWS using Terraform, ensuring high availability and scalability.
  • Built real-time data pipelines with Apache Kafka and Apache Flink to process IoT device telemetry and event streams at scale, alongside batch ETL with Apache Spark (Scala/Python).
  • Orchestrated workflows via Apache Airflow, integrating Snowflake, PostgreSQL, and Redis for analytics and caching.
  • Delivered interactive dashboards with Apache Superset, enabling self-service BI and real-time monitoring of IoT metrics.
  • Developed backend microservices with FastAPI, providing secure access to ML models and business data.
  • Leveraged Neo4j for graph-based analytics to model device relationships and dependencies across IoT ecosystems.
  • Implemented DevOps practices for cost optimization, monitoring, and reliability improvements (~30% reduction in infrastructure costs).
  • Mentored engineers and led best practices in distributed systems, cloud engineering, IoT data management, and data platform architecture.
FEB 2023 — FEB 2025

Lead Software Engineer / Solutions Architect (CTO)

Wale.ai
  • Responsible for the data platform at Wale.ai, a data and analytics service for the VC/PE ecosystem that uses an advanced AI architecture (ML, GenAI, LLM, RAG, vector store) to revolutionize startup sourcing.
  • Designed the data platform architecture and implemented efficient data pipelines that ensure data quality and consistency for predictive insights (Apache Spark, Databricks, Postgres, REST, Azure Synapse, Azure Data Factory).
  • Integrated machine learning models (NLP, LLM, DL) into data pipelines for predictive insights and put DevOps/MLOps processes in place.
  • Tested and supported the implementation of cutting-edge LLM technology, including RAG and a multimodal approach (text, speech/voice, image): OpenAI GPT-4 Turbo, Whisper, Mistral/Mixtral, Llama.
  • Optimized cloud costs and ETL processes, achieving ~$100k USD in savings per year on computing and API costs.
  • Acted as deputy CTO and supervised a team of 3 web developers, leading to the successful launch of the product's beta version.
  • The project was featured in the Data-driven Venture Capital (DDVC 2023) report among the top 20 leaders driving innovation in VC.
MAY 2021 — JAN 2023

Lead Data Engineer / Solutions Architect

Sberbank
  • Led a team of 3 data engineers and managed multiple service providers, overseeing data platforms of e-commerce marketplaces and delivery startups with a GMV exceeding $2 billion.
  • Designed and led implementation of a robust cloud data platform architecture supporting real-time and streaming data pipelines, integrating data landscapes of 5 different business entities for 300k+ daily orders (Apache Airflow, REST, PostgreSQL, Apache Spark, S3).
  • Implemented DataOps principles, enhancing data operations and development processes (go-to-market speed for new data products increased by 50%, production errors decreased by 15%).
  • Supported the implementation of AI services, such as courier routing optimization and product card matching, leading to more than $20M USD in savings and revenue increase.
JUL 2019 — APR 2021

Lead Data Engineer

VTB Bank
  • Supervised two data engineering teams, totaling 30 members, as a cross-functional team leader on mission-critical big data projects utilizing Apache Airflow, Apache Spark, Y. Cloud, Docker, Kubernetes, and S3.
  • Automated deployment processes for key big data technologies, significantly enhancing operational efficiency.
  • Authored comprehensive software documentation to facilitate knowledge transfer and streamline system maintenance, leading to a 50% reduction in system crashes.
  • Engaged in R&D initiatives during solution selection and architectural design phases, contributing to innovative technological solutions.
  • Provided expert consultation and support to cross-functional teams, promoting collaborative problem-solving and knowledge sharing.
  • Automated data uploads, resulting in a 30% reduction in risks associated with mortgages and loans, while improving the accuracy of machine learning models by 300%.
DEC 2016 — MAY 2019

Senior Data Engineer

Sberbank-Technology
  • Contributed to the design and development of a cutting-edge streaming infrastructure (Apache Kafka, Apache Flink, Scala, Bash, ELK).
  • Ensured robust functionality through hands-on implementation and support.
  • Contributed to maintaining high code quality by writing and maintaining unit tests.
  • Actively participated in code review processes to uphold best practices and enhance collaborative development.
  • Optimized Flink CEP workflows, significantly improving data throughput and reducing latency from approximately 5 seconds to around 200 milliseconds.

Education

Moscow Institute of Physics and Technology (MIPT)

Master's — Applied Mathematics and Computer Science
2016 — 2018

Novgorod State University (NovSU)

Bachelor's — Nanotechnology
2012 — 2016

Languages

🇸🇦 Arabic: Native

🇷🇺 Russian: Native

🇫🇷 French: Native

🇬🇧 English: Professional

🇪🇸 Spanish: Elementary
