Data Scientist
SWA Dallas office, TX (onsite) - in-person interview
Rate: C2C
Job Description: Data Scientist
We are seeking a skilled and motivated Data Science Supervisor to lead a team of data scientists and analysts in delivering impactful insights and data-driven solutions. The ideal candidate will have a strong background in data science, machine learning, and statistical analysis, paired with excellent leadership and project management skills. This role requires the ability to manage projects, mentor team members, and collaborate effectively with cross-functional teams to drive business decisions.
Additionally, we are looking for a talented Data Scientist with expertise in unsupervised learning techniques to extract meaningful patterns, insights, and structures from unlabeled data, as well as a Data Scientist with expertise in Reinforcement Learning (RL) to develop, implement, and optimize RL algorithms for complex real-world problems. Both roles require a strong foundation in statistical analysis, machine learning, data visualization, and collaboration with cross-functional teams to create intelligent systems that learn and adapt, driving innovative solutions and data-driven decision-making.
Required Technical Skills:
- Python: Widely used for ML, with libraries like TensorFlow, PyTorch, Scikit-learn, Pandas, and NumPy.
- Statistical analysis and data visualization.
- SQL: For querying and managing relational databases.
- Machine Learning Frameworks and Libraries
- TensorFlow and Keras: For deep learning model development.
- PyTorch: Popular for research and production in DL.
- Scikit-learn: Essential for traditional ML algorithms.
- Gradient boosting libraries (e.g., XGBoost, LightGBM, CatBoost): For boosting models.
- Hugging Face Transformers: For NLP tasks.
- Pandas: Data manipulation and analysis.
- NumPy: Numerical computing.
- Apache Spark and Hadoop: For big data processing.
- ETL Tools: To handle data pipelines.
- Knowledge of data wrangling, cleaning, and pre-processing.
- Decision Trees: Tree-like structures used for classification and regression tasks.
- Random Forests: An ensemble of decision trees that improves accuracy and reduces overfitting.
- Gradient Boosted Trees: Combines weak learners (trees) sequentially to minimize loss (e.g., XGBoost, LightGBM, CatBoost).
- K-Means Clustering: Divides data into k clusters by minimizing intra-cluster variance.
- Hierarchical Clustering: Builds a tree of clusters (dendrogram) based on the hierarchy.
- DBSCAN (Density-Based Spatial Clustering of Applications with Noise): Groups points based on density, identifying noise and outliers.
- Gaussian Mixture Models (GMMs): Uses probabilistic models assuming data comes from a mixture of Gaussian distributions.
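As an illustration of the K-Means item above, here is a minimal sketch of the idea in plain Python (Lloyd's algorithm). The function names, the toy data, and the choice to start from the first k points are assumptions made for this example only; in practice a library such as Scikit-learn would normally be used.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Hypothetical helper: alternate assignment and centroid updates."""
    # Simplifying assumption: use the first k points as starting centroids.
    centroids = list(points[:k])
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                mean = tuple(sum(c) / len(members) for c in zip(*members))
                new_centroids.append(mean)
            else:
                # Keep the old centroid if a cluster ends up empty.
                new_centroids.append(centroids[i])
        if new_centroids == centroids:
            break  # assignments are stable, so the algorithm has converged
        centroids = new_centroids
    labels = [min(range(k), key=lambda i: sq_dist(p, centroids[i])) for p in points]
    return centroids, labels


def inertia(points, centroids, labels):
    """Total intra-cluster variance (sum of squared distances to centroids)."""
    return sum(sq_dist(p, centroids[l]) for p, l in zip(points, labels))


# Toy data: two visibly separate groups of 2-D points.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
centroids, labels = kmeans(points, k=2)
print(labels)
print(inertia(points, centroids, labels))
```

Each pass can only lower or keep the total squared distance returned by `inertia`, which is the sense in which K-Means minimizes intra-cluster variance.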
Soft Skills:
- Excellent problem-solving and analytical skills.
- Strong communication and collaboration abilities.
- Ability to work in a fast-paced, dynamic environment.