Sr. Platform Engineer- GenAI
Company: Disability Solutions
Location: Ann Arbor
Posted on: October 30, 2024
Job Description:
Base Pay Range: $103,000.00 - $175,100.00 Annually
Primary Location: USA-MI-Ann Arbor-KLA
KLA's total rewards package for
employees may also include participation in performance incentive
programs and eligibility for additional benefits identified below.
Interns are eligible for some of the benefits identified below. Our
pay ranges are determined by role, level, and location. The range
displayed above reflects the minimum and maximum pay for this
position in the primary location identified in this posting. Actual
pay depends on several factors, including location, job-related
skills, experience, and relevant education level or training. If
applicable, your recruiter can share more about the specific pay
range for your preferred location during the hiring process.
Company Overview
KLA is a global leader in diversified electronics
for the semiconductor manufacturing ecosystem. Virtually every
electronic device in the world is produced using our technologies.
No laptop, smartphone, wearable device, voice-controlled gadget,
flexible screen, VR device or smart car would have made it into
your hands without us. KLA invents systems and solutions for the
manufacturing of wafers and reticles, integrated circuits,
packaging, printed circuit boards and flat panel displays. The
innovative ideas and devices that are advancing humanity all begin
with inspiration, research and development. KLA places an
above-average focus on innovation and invests 15% of sales back into R&D.
Our expert teams of physicists, engineers, data scientists and
problem-solvers work together with the world's leading technology
providers to accelerate the delivery of tomorrow's electronic
devices. Life here is exciting and our teams thrive on tackling
really hard problems. There is never a dull moment with us.
Group/Division
The Information Technology (IT) group at KLA is
involved in every aspect of the global business. IT's mission is to
enable business growth and productivity by connecting people,
process, and technology. It focuses not only on enhancing the
technology that enables our business to thrive but also on how
employees use and are empowered by technology. This integrated
approach to customer service, creativity and technological
excellence enables employee productivity, business analytics, and
process excellence.
Job Description/Preferred Qualifications
- Identify and resolve infrastructure gaps to ensure reliable,
efficient, and scalable solutions
- Develop advanced AI/ML infrastructure solutions that enhance
the efficiency of our skilled ML teams
- Design and implement solutions for critical areas, including
distributed storage systems, scheduling systems, high availability
capabilities, and core reliability issues within our large-scale
GPU clusters
- Monitor and optimize the performance of our AI/ML
infrastructure, ensuring high availability, scalability, and
efficient resource utilization
- Develop and deploy automation tools, monitoring solutions, and
operational strategies to streamline infrastructure management and
reduce manual tasks
- Work with various teams, including ML developers, data
engineers, and DevOps professionals, to create a cohesive and
integrated AI/ML infrastructure ecosystem
- Implement and manage GPU infrastructure within Kubernetes
clusters to support high-performance computing and AI/ML tasks
- Deploy and manage open-source GenAI components, such as vector
databases and various AI/ML models, ensuring seamless integration
and optimal performance
- Evaluate and integrate new open-source GenAI tools and
technologies to enhance the platform's capabilities
- Collaborate with the research and development teams to
implement and optimize innovative AI/ML models and algorithms
- Ensure the security and compliance of open-source GenAI
components within the infrastructure
- Leverage High-Performance Computing (HPC) experience to
optimize and manage large-scale AI/ML workloads
- Design, implement, and manage on-premises, cloud, and
hybrid-based ML platforms to support diverse AI/ML workloads and
ensure flexibility and scalability
Minimum Qualifications
- Bachelor's Degree or equivalent training/certifications in
Computer Science or related IT field
- Eight (8) years of experience implementing and maintaining AI/ML
infrastructure in on-premises environments
- Strong experience with AI/ML infrastructure and tools,
including GPU clusters and Kubernetes
- Proficiency in deploying and managing open-source GenAI
components and vector databases
- Hands-on experience with high-performance computing (HPC)
environments
- Expertise in designing and managing on-premises, cloud, and
hybrid-based ML platforms
- Solid understanding of distributed storage systems, scheduling
systems, and high availability capabilities
The company offers a
total rewards package that is competitive and comprehensive
including but not limited to the following: medical, dental,
vision, life, and other voluntary benefits, 401(K) including
company matching, employee stock purchase program (ESPP), student
debt assistance, tuition reimbursement program, development and
career growth opportunities and programs, financial planning
benefits, wellness benefits including an employee assistance
program (EAP), paid time off and paid company holidays, and family
care and bonding leave.
KLA is proud to be an Equal Opportunity
Employer. We do not discriminate on the basis of race, religion,
color, national origin, sex, gender identity, gender expression,
sexual orientation, age, marital status, veteran status, disability
status or any other status protected by applicable law. We will
ensure that qualified individuals with disabilities are provided
reasonable accommodation to participate in the job application or
interview process, to perform essential job functions, and to
receive other benefits and privileges of employment. Please contact
us at talent.acquisition@kla.com or at +1-408-352-2808 to request
accommodation.