Data Site Reliability Engineer | Japan Jobs | Fidel Consulting KK

Data Site Reliability Engineer

Job Id : 9896
Posted : 2026-04-14
Industry : Insurance
Employment Type : Full Time, Permanent
Required Skills : Python, Azure DevOps, SQL Database, Grafana, Azure Copilot, Japanese JLPT N2, English 2
City : Tokyo (Hybrid)
State : Tokyo (Hybrid)
Country : Japan
Annual Salary : ¥8,000,000 ~ ¥10,000,000

Job Description

Appealing Points:

  • Ensure reliability, scalability, and performance of large-scale data platforms across hybrid on-premises and cloud environments while driving operational automation.
  • Hands-on experience with Python, Spark, Bash, Azure services, Docker/Kubernetes, and ELK stack for monitoring, automation, and performance optimization.
  • Collaborate with global teams and vendors to enhance platform stability, lead SRE initiatives, and support team development and career growth.

Annual salary: ¥8 million and above

Job Responsibilities:

  • System Reliability and Performance: Ensure the reliability, scalability, and performance of our data platform and services, including monitoring, troubleshooting, and resolving issues.
  • Service Design and Implementation: Collaborate with engineering teams to design, implement, and operate large-scale systems, including developing software that automates and streamlines our operations.
  • Automation and Scripting: Develop and maintain automation scripts and tools to streamline operations, improve efficiency, and reduce manual errors.
  • Monitoring and Alerting: Design and implement monitoring and alerting systems to ensure timely detection and resolution of issues.
  • Collaboration and Communication: Work closely with engineering teams, product managers, and other stakeholders to ensure that systems and services meet business requirements and are aligned with company goals.
  • Incident Response and Management: Participate in incident response and management, including root cause analysis, post-mortems, and implementation of corrective actions.
  • Documentation and Knowledge Sharing: Maintain accurate and up-to-date documentation of systems, services, and processes, and share knowledge with team members to promote collaboration and improvement.
  • Other day-to-day operations for the data platform, in line with internal and global governance.
  • Vendor management.
  • As one of the data engineering leads, improve the team's capability and maturity and support the career development of each team member.
  • Deliver with speed and automation in an agile way.

Job Qualification:

  • Bachelor's or advanced degree in Computer Science, Engineering, or a related field.
  • Minimum 3 years of experience as a Site Reliability Engineer supporting data platforms or other applications on hybrid-cloud platforms that mix on-premises and Azure environments.
  • Strong scripting and programming skills with tools such as Python, Spark, Bash, or PowerShell.
  • Hands-on experience with the ELK stack and observability tools such as Grafana, Kibana, and Splunk.
  • Experience with Azure public cloud services.
  • Ability to analyze and tune application performance and to ensure high availability and stability of the platform.
  • Good hands-on experience with SQL and experience with NoSQL.
  • Essential knowledge of core infrastructure technologies (network, DNS, firewalls, load balancers, Active Directory, RDBMS, Windows/RHEL, infrastructure security, etc.).
  • Knowledge of containerization and container orchestration platforms (Docker, Kubernetes), Terraform, etc.
  • Excellent communication skills.
  • Strong analytical and problem-solving skills to identify and resolve issues in production.

Technical Requirements:

  • Python, Spark, Bash, PowerShell
  • Azure Data Lake Gen2, Data Factory, Synapse Analytics (Data Warehouse, Spark, Pipeline), SQL Database / MI, Cosmos DB, Fabric
  • Azure Application Insights, Azure Log Analytics, Splunk, Grafana, AppDynamics, ELK, Azure Monitor
  • Azure DevOps, Azure Repos, Azure Container Registry
  • Docker, Kubernetes, AKS
  • ServiceNow
  • GitHub / Azure Copilot, LLMs

Language: Business level Japanese (JLPT N2) and Business level English

Company Description

Our client has operations in more than 40 countries and holds leading market positions in the United States, Japan, Latin America, Asia, Europe, and the Middle East. It was ranked #43 on the Fortune 500 list for 2018. With over 150 years of experience, its companies offer life, accident, and health insurance, retirement, and savings products through agents, third-party distributors such as banks and brokers, and direct marketing channels. Its name is recognized and trusted by approximately 100 million customers worldwide, and it serves more than 90 of the top 100 Fortune 500 companies in the United States.

