Data Scientist

June 08, 2020

Responsibilities / Accountabilities

• Accessing, profiling, and analyzing various relational databases, data lakes (HDFS), unstructured data (files, images), and NoSQL or graph databases

• Applying statistical analysis and visualization techniques, such as hierarchical clustering, t-distributed Stochastic Neighbor Embedding (t-SNE), LLE, EM, Ward, and PCA, to various data

• Generating hypotheses about the underlying mechanics of the business process

• Testing hypotheses using various quantitative methods

• Networking with business subject matter experts and product owners to better understand the business mechanics that generated the data

• Analysis of the data through descriptive, exploratory, inferential, and causal techniques

• Preparation of the data through standardization, imputation, cleansing, outlier detection, and formatting

• Engineering features for continuous and discrete data utilizing domain knowledge

• Applying various ML, DL, Reinforcement Learning (RL) and Advanced Analytics techniques to create supervised, semi-supervised, self-supervised, and unsupervised models

• Evaluation and testing of AI models through cross-validation, A/B testing, bias and fairness evaluation, and explainability / interpretability

• Utilizing state-of-the-art (SOTA) methods in ML and DL to achieve superior model performance

• Implementing novel deep learning architectures and employing Neural Architecture Search (NAS) and meta-learning techniques

• Contribution to AI CoE best practices for AAAI Trust & Transparency, AAAI Lifecycle Management, Data Science, ML Engineering, MLOps Engineering, and Data Leverage through partnerships and economy

• Collaboration with the Data Analysts, other Data Scientists, MLOps Engineers, and Data Engineers to evaluate, implement, and deploy to production enterprise-grade Machine Learning and Deep Learning models

Experience

• 5+ years of industry experience as a Data Scientist

• Experience with Big Data, Lambda Architectures (Batch & Stream processing) and visualization

• Experience working in banks and financial institutions (FinTech experience is a plus)

• 5+ years of experience with the Python programming language

• Experience with the Scala programming language is a plus

• 3+ years of experience with scalable, production-grade Data Science (e.g. lifecycle management, experimentation management, model telemetry, and registry)

• 3+ years of experience with scikit-learn, pandas, seaborn, NumPy, SciPy, LightGBM, AdaBoost, CatBoost, and XGBoost

• 2+ years of experience with Keras, TensorFlow, and PyTorch for Deep Learning

• Experience with Apache Kafka for Event Streaming (Confluent platform knowledge is a plus)

• Experience with Apache Spark (Databricks platform knowledge is a plus)

• Knowledge of graph databases (JanusGraph, Apache TinkerPop or Gremlin)

• Experience with Agile processes and Software Engineering best practices

Education / Research / Publication

• A master's degree in Computer Science, Statistics, Mathematics or related fields

• A Ph.D. degree in Computer Science, Statistics, Mathematics or related fields is preferred

• Experience working in an academic AI research lab is a plus

• Academic publications on Deep Learning, Machine Learning or Operations Research are a plus

Soft skills

• Strong communication, documentation, storytelling, creativity, and presentation skills

• Strong interpersonal, teamwork, coordination, and consensus-building skills

• Strong organizational skills and the ability to perform under pressure and manage multiple priorities with competing demands
