Experience
Lead Data Engineer
Spearheaded the redesign of legacy data pipelines using Meltano, Scrapy, and Airflow, resulting in a [...]% reduction in infrastructure maintenance time and a [...]% improvement in data reliability.
Architected a custom data-extraction solution for Slack that worked around 2FA and high-cost API limitations, saving the company 200k USD per year in operational costs.
Developed a robust data pipeline for SMA Resume collection, implementing request-bypass logic that successfully processed [...] resumes for the recruitment team without hitting service limits.
Integrated the Expert App and Grading Service databases into a centralized BigQuery warehouse, enabling 50+ internal stakeholders to access real-time performance metrics via dbt, improving training performance by 10% and saving the company more than 100k USD per year in operational costs.
Standardized production environments by implementing Infrastructure-as-Code (Terraform) and CI/CD (GitHub Actions), which reduced deployment failure rates.
Staff Data Engineer
Led the construction of an end-to-end data pipeline using dbt and Snowflake, consolidating hundreds of disparate data sources into a single source of truth for university evaluations.
Orchestrated complex workflows using Airflow and Kubernetes, supporting a network of 5 context-rich websites (100k+ monthly active users) and improving their data-refresh frequency from quarterly to daily.
Improved Google organic ranking for primary web assets by implementing a dimensionally modeled data structure that served context-rich metadata via a TypeScript GraphQL API.
Lead Data Engineer
Restructured AWS cloud infrastructure and security using Terraform for more than 10k active sensors, preventing an estimated 48 hours of potential downtime.
Decoupled Python and NodeJS microservices using Apache Kafka (AWS MSK), enabling real-time processing of millions of events per second for predictive maintenance ML models.
Data Engineer
Optimized an e-commerce recommendation pipeline using Databricks (Scala/Spark) for Casas Bahia (a major Brazilian retailer), resulting in a 1.8% increase in client revenue, totaling millions of BRL.
Engineered a recommendation platform for a global shrimp producer using Kedro, Airflow, and BigQuery on GCP, increasing average product size by 40% while reducing feed costs by 20%.
Resolved critical production data issues for a Japanese industrial vessel fleet, ensuring 100% on-time delivery of data reports and avoiding potential financial penalties of thousands of USD.
Data Specialist
Constructed a scalable BigQuery Data Lake, reducing average query execution times from several days to minutes for the finance department.
Migrated a legacy SQL Server warehouse to Snowflake, moving more than 1 TB of financial data while applying Kimball principles to improve cross-departmental reporting speed.
Data Engineer
Engineered a serverless data acquisition infrastructure on AWS (Lambda, SQS, S3) using Infrastructure-as-Code (CloudFormation), supporting the secure ingestion of more than 100k records daily.
Refactored high-availability Python (Scrapy) web scrapers, increasing data collection success rates by 30% across thousands of distinct public and private data sources.
Software Developer Intern / DBA
Optimized MS SQL Server stored procedures and migrated on-premise databases to Azure SQL Elastic Pools, resulting in a 20% improvement in application responsiveness with zero downtime.
Developed a remote investor portal using VB.NET (ASP.NET MVC 5) and Entity Framework, providing secure data access to more than 200 external stakeholders.
Software Engineer Intern
Developed Machine Learning algorithms for sentiment analysis and e-commerce search using Java, improving search relevance for user-generated content.
Automated social media monitoring workflows for the Odysci Media Analyzer platform, reducing manual data processing time by hours per week.
