NYU Langone Health is a world-class, patient-centered, integrated academic medical center, known for its excellence in clinical care, research, and education. It comprises more than 200 locations throughout the New York area, including five inpatient locations, a children’s hospital, three emergency rooms and a level 1 trauma center. Also part of NYU Langone Health is the Laura and Isaac Perlmutter Cancer Center, a National Cancer Institute designated comprehensive cancer center, and NYU Grossman School of Medicine, which since 1841 has trained thousands of physicians and scientists who have helped to shape the course of medical history. For more information, go to nyulangone.org, and interact with us on LinkedIn, Glassdoor, Indeed, Facebook, Twitter, YouTube and Instagram.
We have an exciting opportunity to join our team as an Enterprise Data Governance Analyst.
NYU Langone Health’s Information Management and Governance team is responsible for data architecture, data management, data governance solutions and big data solutions. In this role, the successful candidate will serve as a healthcare data engineer/analyst and a key technical member of this team, responsible for big data engineering, data wrangling, data analysis and user support, primarily focused on the Cloudera Hadoop platform and extending to the cloud in the future. The data engineer must have strong hands-on technical skills, including conventional ETL and SQL skills, along with programming and data science languages such as Python and R, using big data techniques.
- Requirements analysis, planning and forecasting for Hadoop data engineering/ingestion projects
- Operational support for data ingestion and engineering, including job monitoring; issue resolution; user support
- Coordinate with infrastructure and offsite/offshore teams
- Design and implement optimized Hadoop and big data solutions for data ingestion, data processing, data wrangling, and data delivery
- Share subject matter expertise on Hadoop-related concepts and use
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
- Build the infrastructure required for efficient extraction, transformation, and loading of data from a wide variety of data sources
- Assist users with technical issues related to their use of Hadoop
- Build data tools that help analytics and data science team members build and optimize our product into an innovative industry leader
- Create high-quality technical and user documentation
To qualify you must have:
- Bachelor's degree or higher in Computer Science or a related field
- Minimum 10 years of total IT experience, including 5+ years of experience with similar responsibilities and 3+ years of Hadoop experience
- Strong knowledge of Cloudera Hadoop fundamentals, including HDFS, Hive, Impala, Parquet, Sentry, Sqoop, Pig, Flume, Oozie, YARN, ZooKeeper, Spark, Hive Metastore, Solr, Kudu
- Strong ETL and data engineering/ingestion experience with ingesting diverse data from various sources, including relational databases and files (text, CSV)
- Strong knowledge of SQL and databases, particularly Oracle and Microsoft SQL Server
- Strong knowledge of querying Hadoop data using Hive and Impala, as well as query optimization
- Experience with at least one data integration (ETL) tool in a conventional or big data setting
- Strong knowledge of Unix/Linux, including scripting
- Good knowledge of Java; knowledge of Groovy helpful
- Knowledge of Python or R, and willingness to learn languages
- Some experience with cloud data management with AWS or GCP
- Ability to lead resources with greater technical skills
- Creative and innovative approach to problem-solving
- Excellent communication and interpersonal skills at all levels of the organization
- Experience with agile development and tools
- Willingness to explore new alternatives or options to solve data engineering and data mining issues, and utilize a combination of industry best practices, innovations and experience to get the job done
- Experience performing root cause analysis on internal and external data and processes to answer specific business questions and find opportunities for improvement
Preferred qualifications:
- Experience with healthcare data, particularly with providers and academic medical centers, is a strong plus
- Knowledge of or familiarity with natural language processing (NLP) and machine learning
- Experience with agile development and tools
- Basic knowledge of statistical techniques
- Exposure to or experience with cloud data management and services (AWS, GCP)
Qualified candidates must be able to effectively communicate with all levels of the organization.
NYU Langone Health provides its staff with far more than just a place to work. Rather, we are an institution you can be proud of, an institution where you’ll feel good about devoting your time and your talents.
NYU Langone Health is an equal opportunity and affirmative action employer committed to diversity and inclusion in all aspects of recruiting and employment. All qualified individuals are encouraged to apply and will receive consideration without regard to race, color, gender, gender identity or expression, sex, sexual orientation, transgender status, gender dysphoria, national origin, age, religion, disability, military and veteran status, marital or parental status, citizenship status, genetic information or any other factor which cannot lawfully be used as a basis for an employment decision. We require applications to be completed online.
If you wish to view NYU Langone Health’s EEO policies, please click here. Please click here to view the Federal “EEO is the law” poster or visit https://www.dol.gov/ofccp/regs/compliance/posters/ofccpost.htm for more information. To view the Pay Transparency Notice, please click here.