Pedro Henrique Rocha Moy, Developer in Miami, FL, United States

Pedro Henrique Rocha Moy

Verified Expert in Engineering

Machine Learning Developer

Location
Miami, FL, United States
Toptal Member Since
April 25, 2019

Pedro is a business-oriented, seasoned data scientist and data engineer with experience building and deploying production distributed data pipelines and machine learning models at scale, covering the entire data lifecycle from design, construction, optimization, and deployment to the monitoring of data architectures and machine learning models. Pedro's focus is to deliver solutions that are robust to changes in environment and data and flexible enough to address changes in business requirements.

Portfolio

Rocha Moy Trading
Python, Julia, Amazon Web Services (AWS), Options Trading, APIs, Web Scraping...
Self-employed
Scikit-learn, SpaCy, GPT, Natural Language Processing...
Toptal Client
Python, Amazon Elastic MapReduce (EMR), Spark, Snowflake

Availability

Full-time

Preferred Environment

Python, Scala, Amazon Web Services (AWS), Data Engineering, Data Science, Machine Learning, Big Data, Software Architecture

The most amazing...

...systems I've built are algorithmic and probabilistic trading systems. With a limited view of the world, probabilities are essential tools in risk management.

Work Experience

Chief Architect

2017 - PRESENT
Rocha Moy Trading
  • Developed the API for probabilistic and algorithmic options trading with Interactive Brokers and TD Ameritrade. Expertise included data integration, task automation, portfolio simulation, risk mitigation, and strategy validation.
  • Integrated many different data sources, from APIs to web scraping.
  • Fully automated trade execution, trade scheduling, and the release of funds for trading.
Technologies: Python, Julia, Amazon Web Services (AWS), Options Trading, APIs, Web Scraping, Probability Theory, Machine Learning, Simulations, Data Integration

Lead Data Scientist

2021 - 2022
Self-employed
  • Designed, implemented, and deployed different natural language processing models.
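  • Worked with stakeholders to understand use cases, avenues for product development, and implementations using the deployed models.
  • Mentored and supported junior data scientists on the team.
Technologies: Scikit-learn, SpaCy, Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), GPT, Neural Networks, XGBoost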

Enterprise Lead Data Architect - Contractor

2020 - 2022
Toptal Client
  • Handled the architecture, development, and automation of distributed computing pipelines and data storage in the cloud for the enterprise.
  • Automated scalable infrastructure in the cloud to respond to development and consumer demand.
  • Co-managed and supervised a team of engineers, including designing and delegating tasks, mentoring, and overseeing work.
Technologies: Python, Amazon Elastic MapReduce (EMR), Spark, Snowflake

Enterprise Senior ETL and Data Engineer - Contractor

2019 - 2020
Toptal Client
  • Designed, implemented, and deployed to production full-fledged distributed ETL jobs using the Spark Scala API.
  • Handled various data sources and sinks, including disparate files, Hive tables, Mongo collections, and Kafka brokers.
  • Served as the senior engineer and tech lead of the team, strengthening engineering and development processes, improving software quality control, and helping design sprint stories.
Technologies: Oracle SQL, DocumentDB, Scala, Python, MongoDB, Spark SQL, Spark, Apache Kafka, Hadoop

Hadoop Proof of Concept for an Atmospheric Science Project - Contractor

2019 - 2020
Toptal Client
  • Built a cluster from scratch, adhering to the client's need to work with a home cluster.
  • Designed and implemented generic and specific data architectures meeting the client's query complexity and performance needs.
  • Built PySpark and Python software layers of abstraction to allow the client to build on top of the current infrastructure.
Technologies: PySpark, Hadoop

Research Data Engineer

2018 - 2019
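Nicklaus Children's Hospital
  • Developed existing analytical and data workflows for users of R, Python, and Impala to establish engineering best practices.
  • Provided ad hoc and systematic development of ETL and big data pipelines, validation, and integration of different data sources.
  • Served as the research department's liaison to the IT and BI departments, providing guidance and expertise on analytical and data needs.
Technologies: Impala, Hadoop, Spark, Scala, Python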

Technical Advisor

2018 - 2018
Insight Data Science
  • Worked with fellows and their data engineering projects on problem definition, system architecture, and execution.
  • Advised on technologies such as Spark, Kafka, Redis, HBase, Cassandra, and PostgreSQL.
  • Conducted mock interviews with fellows on scalability concepts, algorithms, and CS fundamentals.
Technologies: PostgreSQL, Cassandra, HBase, Redis, Apache Kafka, Spark

Senior Software Engineer

2016 - 2017
NexHealth
  • Developed and deployed software to the client's site to perform data collection and server sync.
  • Performed both database and web-based data integrations of electronic medical records back to NexHealth servers.
  • Developed a smart SMS response system allowing the user to interact with NexHealth products via SMS.
Technologies: Redis, PostgreSQL, Apache Spark, JavaScript, Scala, Python, Ruby on Rails (RoR)

Data Scientist

2016 - 2016
QuaEra Insights
  • Served as the lead data scientist in a consulting project overseeing data management and modeling strategy.
  • Used natural language processing to transform unstructured data into features and extract business intelligence.
  • Built a recommendation engine expressed as business rules, potentially yielding savings on up to 50% of the business.
Technologies: Python

Data Engineering Fellow

2015 - 2015
Insight Data Science
  • Built a platform designed to discover influencers on YouTube for global brands.
  • Deployed Spark on Amazon EMR with HBase, processing and ingesting billions of data tuples.
  • Achieved linear scalability in tests with up to 20 nodes.
Technologies: Amazon Web Services (AWS), Bootstrap, Hadoop, Apache Spark, Python

Data Analyst

2015 - 2015
Cartesian
  • Aided managed analytics efforts, promoting best practices within batch workflows and data management.
  • Conducted independent research into big data workflows considering data mining and BI integration.
  • Built short data pipelines consuming APIs, transforming, loading, and exposing data connections to BI tools.
Technologies: Alteryx, PostgreSQL, R, Python, Data Analytics, Managed Analytics

Data Analytics Engineer

2013 - 2015
Daktari Diagnostics
  • Worked as the lead developer of mainstream data processing and data analysis applications in Python for Windows/Mac.
  • Developed a calibration model for the Daktari CD4 testing device improving the system's accuracy by 20-30%.
  • Deployed machine learning models embedded in standalone applications to end users for data classification.
Technologies: Microsoft SQL Server, JMP, SAS, R, Python

Persistent Edge and Hedged Stock Trading Strategies

http://docs.google.com/presentation/d/1zkbfErfwbJvGBXFj9UWKDvq99wkj6EBvqniA4yFNu68/edit?usp=sharing
This investigation explores reinforcement learning agents as a means to generate a diversified set of strategies that guarantees the existence of optimal strategies for any market condition. The preliminary result demonstrates that the pool of agents provides the desired diversity, transforming the algorithmic trading challenge into a problem of selection (which may be tackled with AI methods such as evolutionary computing).
2021 - 2022

Executive MBA in Business Administration

University of Miami - Miami

2015 - 2017

Master's Degree in Computer Science (Machine Learning)

Georgia Institute of Technology - Atlanta, Georgia

2010 - 2012

Master's Degree in Earth Science and Engineering (Geophysics)

King Abdullah University of Science and Technology - Saudi Arabia

2008 - 2010

Bachelor's Degree in Mechanical Engineering

University of Massachusetts Lowell

Libraries/APIs

Microsoft HPC, PySpark, TensorFlow, PyTorch, Scikit-learn, XGBoost, Dask, SpaCy

Tools

ChatGPT, Amazon Elastic MapReduce (EMR), Spark SQL, JMP, Impala, Git, Gensim

Languages

Python, Julia, Scala, SQL, R, SAS, JavaScript, Bash, Snowflake

Storage

NoSQL, MongoDB, Oracle SQL, Microsoft SQL Server, Redis, Cassandra, PostgreSQL, HBase, Apache Hive, Data Integration

Industry Expertise

Accounting

Paradigms

Functional Programming, Parallel Programming, Distributed Computing, Data Science

Platforms

Docker, Jupyter Notebook, Apache Kafka, Alteryx, Linux, Amazon Web Services (AWS)

Frameworks

Bootstrap, Ruby on Rails (RoR), Spark, Apache Spark, Flask, Hadoop, Streamlit

Other

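Machine Learning, Distributed Systems, OpenAI GPT-4 API, Financial Modeling, Web App UI, APIs, Data Architecture, Data Modeling, DocumentDB, Dash, Deep Learning, Natural Language Processing (NLP), Data Engineering, Artificial Intelligence (AI), Algorithms, Algorithmic Trading, Optimization, Reinforcement Learning, Time Series Analysis, Forecasting, Cloud, Numerical Optimization, Sentiment Analysis, Neural Networks, Options Trading, Web Scraping, Probability Theory, Simulations, Finance, Law, Entrepreneurship, Leadership, Big Data, Software Architecture, GPT, Generative Pre-trained Transformers (GPT), Data Analytics, Managed Analytics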
