The Associate, Data Scientist plays a critical role in realizing the enterprise data and advanced analytics strategy, working comfortably across the full spectrum of machine learning models, database design, and predictive analytical solutions.
- Use Big Data tools (Hadoop, Spark, Azure) to conduct the analysis of billions of customer transaction records.
- Write software to clean and investigate large data sets of numerical and textual data.
- Write queries to filter data or to join multiple data sets.
- Denormalize data from multiple disparate data sets.
- Read and/or create a Hive or an HCatalog table from existing data in HDFS.
- Integrate with external data sources and APIs to discover interesting trends using Salesforce.com, Encompass, and many other application data sets.
- Build machine learning models from development through testing and validation.
- Design rich data visualizations to communicate complex ideas to business stakeholders.
- Import and export data between an external RDBMS and the Hadoop Cloudera cluster, including the ability to import specific subsets, change the delimiter and file format of imported data during ingest, and alter the data access pattern or privileges.
- Investigate the impact of new technologies on the future of digital banking and the financial world of tomorrow.
- Ability to communicate, both verbally and in writing, with employees in the business units using non-technical language.
- Experience with machine learning.
- Ability to develop in Python, Scala, and/or R.
- Strong experience utilizing Spark and Hadoop.
- Strong experience utilizing SQL and Hive tables.
- Knowledge of Machine Learning and Computational Statistics.
- Knowledge of Microsoft Azure preferred.
- Knowledge of Impala with Kudu (interactive SQL).
- Familiarity with BI visualization tools like Microsoft Power BI.
- Experience creating a data-driven culture and impactful data strategies.
- B.S. in computer science or a related IT degree, with a minimum of one year of information technology experience designing and developing application programs using Python and Scala, plus additional knowledge gained by attending a formalized training program.
- Must have experience with Big Data projects and developing data engineering solutions.
- Minimum of two years’ experience with machine learning.
- Knowledge of Trifacta data wrangling software and experience with Azure machine learning preferred.
- Python development experience preferred.