ByteBridge

Data Center & Cloud Infrastructure and Service Orchestrator Architect

Position Description

Location: Japan

Company Introduction

ByteBridge is a trusted innovator in IT services, including data center, enterprise IT, and unified communications. ByteBridge was founded by a team of passionate, dedicated experts with a vision of achieving customer success through technology enablement. We bridge technical gaps to help enterprises realize their vision and expand their business on a global scale. Today, ByteBridge serves some of the world's leading international companies.

Position Overview

We are seeking an expert Data Center & Cloud Infrastructure and Service Orchestrator Architect to design and implement the service orchestration layer that will deploy and manage diverse workloads on top of our multi-region cloud infrastructure. This role focuses on creating the intelligent orchestration systems that automate the deployment, scaling, and management of applications, databases, AI/ML services, and other cloud services.

Key Responsibilities
  • Service Orchestration Platform Design
    • Design comprehensive service orchestration platforms for automated workload deployment and management
    • Architect API-driven service provisioning systems with self-service capabilities
    • Design multi-tenant service isolation and resource allocation frameworks
    • Create service lifecycle management systems including deployment, scaling, updates, and decommissioning
  • Workload Orchestration Architecture
    • Design orchestration systems for diverse workload types:
      • Virtual machine provisioning and management
      • Container orchestration using Kubernetes
      • Database service deployment (SQL, NoSQL, distributed databases)
      • Message queue services (Kafka, RabbitMQ, Apache Pulsar)
      • GPU-accelerated AI/ML services and model inference platforms
      • Large Language Model (LLM) fine-tuning and inference services, comparable to Amazon Bedrock
  • AI/ML Service Orchestration
    • Architect AI/ML pipeline orchestration for model training, validation, and deployment
    • Design GPU resource scheduling and allocation systems for distributed training
    • Create model serving infrastructure with auto-scaling and load balancing
    • Design MLOps platforms for continuous integration and deployment of ML models
    • Architect LLM inference services with dynamic scaling and cost optimization
  • Service Discovery and Integration
    • Design service mesh architectures for microservices communication
    • Architect API gateway and service proxy solutions
    • Create service discovery, configuration management, and secrets management systems
    • Design inter-service communication patterns and protocols
  • Automation and DevOps Integration
    • Design CI/CD pipelines integrated with service orchestration platforms
    • Architect GitOps workflows for declarative service management
    • Create policy-based governance and compliance automation
    • Design cost management and resource optimization automation

Required Qualifications
  • Experience
    • 12+ years in distributed systems and service orchestration
    • 8+ years with container orchestration platforms (Kubernetes, Docker Swarm)
    • 6+ years with cloud-native application architecture and microservices
    • 5+ years with AI/ML infrastructure and model deployment systems
  • Service Orchestration Expertise
    • Kubernetes: Expert-level knowledge of Kubernetes architecture, custom resources, operators, and ecosystem tools
    • Container Technologies: Deep understanding of Docker, containerd, container networking, and storage
    • Service Mesh: Experience with Istio, Linkerd, or Consul Connect
    • API Design: Proficiency in REST, GraphQL, and gRPC API design and implementation
    • Workflow Orchestration: Experience with Apache Airflow, Temporal, or similar workflow engines
  • AI/ML and Data Services
    • ML Orchestration: Experience with Kubeflow, MLflow, or similar ML pipeline platforms
    • GPU Computing: Knowledge of CUDA, distributed training frameworks (PyTorch, TensorFlow)
    • Model Serving: Experience with model inference platforms (TensorFlow Serving, TorchServe, NVIDIA Triton)
    • LLM Infrastructure: Understanding of large language model deployment, fine-tuning, and inference optimization
    • Data Platforms: Experience with big data orchestration (Apache Spark, Flink, Kafka)
  • Cloud-Native Technologies
    • DevOps Tools: Proficiency with Jenkins, GitLab CI, ArgoCD, or similar CI/CD platforms
    • Infrastructure as Code: Experience integrating with Terraform, Pulumi, or CloudFormation
    • Monitoring & Observability: Knowledge of Prometheus, Jaeger, Fluentd, and cloud-native monitoring stacks
    • Security: Understanding of container security, RBAC, network policies, and secrets management
  • Programming and Automation
    • Languages: Proficiency in Go, Python, or Java for building orchestration tools and operators
    • Scripting: Advanced skills in Bash and Python for automation and integration scripts
    • Database Technologies: Experience orchestrating various database types (PostgreSQL, MongoDB, Elasticsearch, Redis)
    • Message Queues: Deep knowledge of Kafka ecosystem, RabbitMQ, and event-driven architectures

Preferred Qualifications
  • Experience building platform-as-a-service (PaaS) or infrastructure-as-a-service (IaaS) offerings
  • Knowledge of serverless computing platforms and function-as-a-service (FaaS)
  • Experience with edge computing and distributed cloud orchestration
  • Certifications: CKA, CKAD, CKS (Kubernetes), or relevant cloud platform certifications
  • Experience with FinOps and cloud cost optimization

Education & Certifications
  • Bachelor's degree in Computer Science, Software Engineering, or a related field
  • Master's degree in distributed systems or a related area preferred
  • Industry certifications in Kubernetes, cloud platforms, or DevOps practices

Key Competencies
  • Strong software engineering and system design skills
  • Deep understanding of distributed systems patterns and microservices architecture
  • Excellent problem-solving abilities for complex orchestration challenges
  • Experience with agile development practices and cross-functional team collaboration
  • Strong communication skills for technical architecture discussions

Work Environment
  • Collaborative environment working closely with infrastructure and application teams
  • Opportunity to work with cutting-edge AI/ML and cloud-native technologies
  • Fast-paced environment with a focus on automation and self-service capabilities
  • Regular interaction with customer technical teams for requirements gathering