Data Lake Architect in Richmond, VA at APEX Systems

Date Posted: 2/4/2018

Job Description

Job #:  811019

***Apex Systems is now hiring a Data Lake Architect. If you are interested, please contact Caroline McCabe at cmccabe@apexsystems.com***

Overview:

As part of the team, the candidate will work in a collaborative environment with internal business owners to understand key business goals, build solutions, and support analytical work.

The candidate will work directly with other team members to build and support analytical models in cloud and distributed environments.

Role & Responsibilities:

  • Work in a collaborative environment in the manufacturing industry.
  • Work on Azure Data Lake (ADLS) and help migrate existing processes and data from Teradata to Azure Data Lake.
  • Develop U-SQL scripts to create views and replicate the existing processes in ADLS.
  • Understand the business processes and domain data.
  • Quickly understand client needs, develop solutions, and articulate findings to client executives.
  • Develop statistical models, score them, and handle model governance.
  • Work with Hadoop/MapReduce and big data technologies such as Spark and Flume.
  • Work with scripting languages such as Pig and Hive.
  • Use R/Python to analyze client data.
  • Design and build scalable machine learning models to meet the needs of a given client engagement.
  • Assist with the mentorship and development of consultants.

Requirements:

  • 5-7 years of professional work experience as a data scientist or on advanced analytics / statistics projects.
  • Degree in Computer Science, Statistics, Economics, Physics, Engineering, Mathematics, or other closely related field is preferred but not mandatory.
  • Experience writing U-SQL scripts.
  • Thorough understanding of Azure Data Lake and a firm grasp of basic tasks such as connecting to data in the lake and connecting R to Azure Data Lake.
  • Basic knowledge of programming.
  • Experience working with at least one big data distributed programming language or ecosystem (e.g. Hadoop/MapReduce, Pig, Hive, Spark).
  • Strong understanding and application of statistical methods and skills: distributions, experimental design, variance analysis, A/B testing, and regression.
  • Good understanding of CPG data.
  • Experience with Azure Data Lake Storage and strong knowledge of U-SQL.
  • Experience with Azure Data Lake Analytics.
  • Mastery of one or more statistical software packages and scripting languages (e.g. R, Python, Perl, SAS).
  • Knowledge of or experience working with database systems (e.g. SQL, NoSQL, MongoDB, Postgres).
  • Experience with PowerPoint and ability to clearly articulate findings and present solutions.
  • Excellent team-oriented and interpersonal skills.