Our Works

Published on 2024/06/11

Performance Optimization

Enhanced system response speed and stability through database and delivery route optimization.

Speeding Up Search Responses (Eliminating N+1)

In a ticket management SaaS, we eliminated N+1 queries by bulk-loading associations with Preload and batching lookups with IN clauses, improving list API response time from 1.4s to 0.2s (about 7x faster) and reducing DB query count by 80%.

Perspective / Details
Issue: A large number of N+1 queries occurred in the list API, issuing a huge number of SELECTs on every page fetch.
Action: Redesigned the data access layer to fetch related data in bulk with Preload, and optimized complex-condition lookups with batched retrieval using IN clauses.
Observation: Introduced structured query logs keyed by request correlation ID, visualizing query count, type, and latency in real time.
Result / Outcome: The reduced query count stabilized DB load and connection pool utilization, and established a state where performance regressions are detected continuously.
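The batching that replaces the N+1 pattern can be sketched as follows. This is a minimal Go illustration with an in-memory stand-in for the database; the Ticket / Comment models and the fakeDB type are hypothetical, and in the real system the single round trip would be GORM's Preload or a hand-built IN-clause query.

```go
package main

import "fmt"

// Ticket and Comment are hypothetical models standing in for the SaaS schema.
type Ticket struct {
	ID    int
	Title string
}

type Comment struct {
	TicketID int
	Body     string
}

// fakeDB stands in for the database; each call to commentsForTickets counts as one query.
type fakeDB struct {
	comments []Comment
	queries  int
}

// commentsForTickets fetches comments for many tickets in one round trip,
// the equivalent of `SELECT ... WHERE ticket_id IN (...)` (or GORM's Preload).
func (db *fakeDB) commentsForTickets(ids []int) map[int][]Comment {
	db.queries++
	want := make(map[int]bool, len(ids))
	for _, id := range ids {
		want[id] = true
	}
	out := make(map[int][]Comment)
	for _, c := range db.comments {
		if want[c.TicketID] {
			out[c.TicketID] = append(out[c.TicketID], c)
		}
	}
	return out
}

func main() {
	db := &fakeDB{comments: []Comment{
		{TicketID: 1, Body: "first"},
		{TicketID: 1, Body: "second"},
		{TicketID: 2, Body: "third"},
	}}
	tickets := []Ticket{{ID: 1, Title: "a"}, {ID: 2, Title: "b"}, {ID: 3, Title: "c"}}

	// Instead of one query per ticket (N+1), collect the IDs and batch-load once.
	ids := make([]int, 0, len(tickets))
	for _, t := range tickets {
		ids = append(ids, t.ID)
	}
	byTicket := db.commentsForTickets(ids)

	fmt.Println(db.queries, len(byTicket[1])) // → 1 2
}
```

The key property is that the query count stays constant (one batched SELECT) regardless of how many tickets the page lists.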

Recovery from MySQL Failure and Stabilization of Write Performance

Resolved the batch I/O load that had been causing production DB failures through monthly table partitioning and INSERT optimization, cutting batch runtime by 70% and stabilizing free memory. No recurrence in the following three months.

Perspective / Details
Issue: The production DB failed during business hours, resend retries spiked, and overall response degraded. Monitoring coverage was also insufficient, so the risk of recurrence could not be detected in advance.
Initial response: Immediately set up Aurora automatic failover and read-replica distribution, and established continuous monitoring of slow queries, CloudWatch metrics, and error logs.
Investigation: Correlated CloudWatch metrics with event logs and identified that the batch job's DELETE→INSERT pattern was pressuring memory and I/O.
Action: Partitioned the table by month and reworked the batch to TRUNCATE + multi-row INSERT. Operationally, combined phased migration, instant rollback, and weekly reporting.
Result / Outcome: Suppressed the risk of recurring unplanned failover. Batch time and load fluctuation stabilized, improving overall availability and decision-making speed.
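The partition-and-rebuild pattern can be sketched in MySQL DDL as follows; the table and column names are hypothetical stand-ins for the actual schema.

```sql
-- Hypothetical table; names illustrate the pattern, not the real schema.
-- The partition key must be part of the primary key.
CREATE TABLE daily_summaries (
    id            BIGINT NOT NULL AUTO_INCREMENT,
    summarized_on DATE   NOT NULL,
    payload       JSON,
    PRIMARY KEY (id, summarized_on)
)
PARTITION BY RANGE COLUMNS (summarized_on) (
    PARTITION p202406 VALUES LESS THAN ('2024-07-01'),
    PARTITION p202407 VALUES LESS THAN ('2024-08-01'),
    PARTITION p_max   VALUES LESS THAN (MAXVALUE)
);

-- Rebuild step: truncate one partition instead of row-by-row DELETE,
-- which avoids large undo/redo pressure and long-held locks...
ALTER TABLE daily_summaries TRUNCATE PARTITION p202406;

-- ...then reinsert with multi-row INSERT to cut round trips.
INSERT INTO daily_summaries (summarized_on, payload) VALUES
    ('2024-06-01', '{"total": 10}'),
    ('2024-06-02', '{"total": 12}'),
    ('2024-06-03', '{"total": 9}');
```

Truncating a partition is a metadata operation, so the rebuild cost no longer scales with the number of rows being replaced.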

Reducing origin load by introducing a reverse proxy (Nginx)

Introduced an Nginx reverse proxy between the CDN and the app to implement short-TTL caching and connection reuse. Reduced origin-reaching requests by 40% and stabilized CPU load and response variability.

Perspective / Details
Issue: Dynamic API traffic was concentrated on the application servers, keeping CPU usage persistently high and responses variable. Because the CDN connected directly to the app, there was no layer to absorb request spikes.
Action: Placed an Nginx relay layer between the CDN and the app, and introduced short-TTL caching and Keep-Alive connection reuse.
Operations: Rolled out in phases with a design that allows immediate rollback. Visualized HIT / MISS ratios and continuously tuned TTLs and excluded paths.
Result / Outcome: Significantly reduced origin-reaching requests, stabilizing application CPU load and response variability and improving resilience at peak times.
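A minimal sketch of what such a relay layer can look like; the upstream hosts, cache zone, paths, and TTLs below are illustrative assumptions, not the production configuration.

```nginx
# Short-TTL cache zone between CDN and app (names and sizes are illustrative).
proxy_cache_path /var/cache/nginx/api levels=1:2 keys_zone=api_cache:10m
                 max_size=1g inactive=60s;

upstream app {
    server app1.internal:8080;
    server app2.internal:8080;
    keepalive 64;                        # reuse upstream connections
}

server {
    listen 80;

    location /api/ {
        proxy_pass http://app;
        proxy_http_version 1.1;
        proxy_set_header Connection "";  # required for upstream keepalive

        proxy_cache api_cache;
        proxy_cache_valid 200 5s;        # short TTL: absorb bursts, stay fresh
        proxy_cache_use_stale updating;  # serve stale while revalidating
        proxy_cache_lock on;             # collapse concurrent cache misses
        add_header X-Cache-Status $upstream_cache_status;  # HIT / MISS visibility
    }
}
```

Even a TTL of a few seconds collapses identical requests arriving in a burst into a single origin hit, while `X-Cache-Status` supports the HIT / MISS visualization described above.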

Developer Productivity & Quality Automation

Maintained continuous development velocity through quality assurance automation and build pipeline improvements.

Building a Three-Layer Test Foundation that Supports Continuous Improvement

For an environment that depended on manual verification, we built a three-layer test foundation of Unit / Integration / E2E to prevent regressions, enabling continuous large-scale refactors and upgrades.

Perspective / Details
Issue: Dependence on manual verification meant regressions slipped through, verification costs grew, and procedures became person-dependent, stalling the pace of improvements and large-scale upgrades.
Response: Clearly defined the three layers of Unit / Integration / E2E, and prepared an execution environment (CI, DB, mocks, data seeding) that makes tests easy to write.
Operation: Reduced writing and maintenance costs through test templating and shared Fixtures / Builders, and established naming conventions that make the cause of a failure clear at a glance.
Result / Outcome: With confidence that "nothing is broken," refactoring and dependency upgrades can be performed safely, and manual checks were greatly reduced.
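What the shared Fixtures / Builders and at-a-glance naming look like in practice can be sketched as follows. The User type, the builder, and the canLogin check are hypothetical examples in Go, not the actual codebase.

```go
package main

import "fmt"

// User is a hypothetical domain type used to illustrate the builder pattern.
type User struct {
	Name   string
	Email  string
	Active bool
}

// UserBuilder supplies sensible defaults so each test only states what matters.
type UserBuilder struct{ u User }

func NewUser() *UserBuilder {
	return &UserBuilder{u: User{Name: "default", Email: "default@example.com", Active: true}}
}

func (b *UserBuilder) Inactive() *UserBuilder       { b.u.Active = false; return b }
func (b *UserBuilder) Named(n string) *UserBuilder  { b.u.Name = n; return b }
func (b *UserBuilder) Build() User                  { return b.u }

// canLogin is the (hypothetical) unit under test.
func canLogin(u User) bool { return u.Active }

func main() {
	// Case names state condition + expectation, so a failure reads at a glance.
	cases := []struct {
		name string
		user User
		want bool
	}{
		{"active user can log in", NewUser().Build(), true},
		{"inactive user cannot log in", NewUser().Inactive().Build(), false},
	}
	for _, c := range cases {
		if canLogin(c.user) != c.want {
			fmt.Println("FAIL:", c.name)
			return
		}
	}
	fmt.Println("ok") // → ok
}
```

Because defaults live in one place, tests stay short and churn from schema changes is absorbed by the builder rather than by every test file.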

Improving development speed by renewing the build environment

In a Webpack-based monorepo, heavy builds and sluggish HMR were stalling development. By migrating gradually to Rspack, we significantly improved perceived speed and the review cycle.

Perspective / Details
Issue: In a large Webpack-based monorepo, slow HMR, heavy initial/CI builds, and unstable caching had become chronic. As a result, review queues formed easily and it took a long time to reflect and verify changes.
Action: Switched gradually to Rspack, which is largely Webpack-compatible. Reorganized transforms, source maps, and code splitting, redesigned the configuration so caching works stably, and migrated with zero downtime while checking compatibility via automated tests and E2E.
Operation: Ran dual Webpack / Rspack builds on each PR and automatically checked differences (especially bundle size). Visualized metrics such as build time and cache efficiency to keep continuous improvement possible.
Result / Outcome: Clearly improved perceived HMR speed and build time (e.g. incremental build -81%, initial build -67%). Reviews became less likely to stall and the development cycle became lighter and more stable, while gaining benefits step by step without breaking existing mechanisms such as SSR / Storybook / Sentry.
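A gradual switch like this typically keeps the existing webpack.config.js side by side with an Rspack config and diffs the two builds per PR. The sketch below shows what the Rspack side can look like; the entry point, loader options, and file layout are illustrative assumptions, not the actual configuration.

```javascript
// rspack.config.js — illustrative sketch, not the production config.
const path = require('path');

module.exports = {
  entry: './src/index.tsx',
  output: {
    path: path.resolve(__dirname, 'dist'),
    filename: '[name].[contenthash].js',
  },
  devtool: 'source-map',              // keep source maps identical to the Webpack build
  module: {
    rules: [
      {
        test: /\.tsx?$/,
        loader: 'builtin:swc-loader', // Rspack's built-in SWC transform replaces babel-loader
        options: { jsc: { parser: { syntax: 'typescript', tsx: true } } },
      },
    ],
  },
  optimization: {
    splitChunks: { chunks: 'all' },   // match the Webpack code-splitting behavior
  },
};
```

Because the config schema is largely Webpack-compatible, each PR can compare the two builds' bundle sizes automatically and fall back to Webpack instantly if a difference appears.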

Enhanced User Experience

Improved usability and reliability from the user's perspective, including search experiences and reservation systems.

Improving the Search Experience and Reducing “0 Results” (Query Relaxation + Ranking Adjustment)

Improved a search experience that easily produced “0 results” due to exact-match search by gradually relaxing conditions and adjusting scoring. Alternative suggestions are now presented naturally, delivering convincing results while reducing churn.

Perspective / Details
Issue: With exact-match search, even slightly off conditions easily produced “0 results,” so users repeated searches with small changes and eventually churned.
Approach: Introduced a mechanism that scores search candidates and displays them in order of closeness to the conditions, tuning scores (function_score + gauss) so they decay smoothly as distance or match level diverges.
Operation: Continuously monitored relaxation steps, weights, and thresholds on a dashboard, and kept validating and tuning side effects through A/B testing.
Results: Significantly reduced “0-result” searches, decreased re-searches, and increased session duration. Better acceptance of alternative suggestions made the search experience smoother and less disruptive.
Impact: Delivered a search experience with a sense of substance, contributing to improved inventory utilization and booking rates, and left room for future ML-based ranking and expansion to vector search.
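The function_score + gauss tuning can be sketched as the following Elasticsearch query body. The field names (area, location, price) and all origin/scale/decay values are hypothetical examples, not the actual index schema.

```json
{
  "query": {
    "function_score": {
      "query": { "match": { "area": "shinjuku" } },
      "functions": [
        {
          "gauss": {
            "location": { "origin": "35.69,139.70", "scale": "2km", "decay": 0.5 }
          }
        },
        {
          "gauss": {
            "price": { "origin": 8000, "scale": 2000, "decay": 0.5 }
          }
        }
      ],
      "score_mode": "multiply",
      "boost_mode": "multiply"
    }
  }
}
```

With a Gaussian decay, a candidate 2km away or ¥2,000 off the target price still scores 0.5 instead of dropping to zero, which is what lets near-miss inventory surface as a natural alternative rather than a “0 results” page.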

Asynchronizing Search Index Updates to Improve Peak-Time Resilience

We reworked Elasticsearch updates from synchronous to an asynchronous Outbox → Pub/Sub → Indexer pipeline, achieving both stability and freshness of inventory reflection even at peak times. We then standardized this approach internally and rolled it out to other APIs, enabling load leveling and reduced operational costs.

Perspective / Details
Issue: Inventory and price updates were reflected to Elasticsearch synchronously, so writes were delayed or timed out when load concentrated, and availability became unstable during spikes such as sales.
Response: Reworked the flow into an asynchronous Outbox → Pub/Sub → Indexer pipeline, and adopted partial upsert so that only minimal diffs are applied, making updates lighter.
Outcome: Write-path stability was maintained while updates still reached Elasticsearch without significant delay; inventory freshness and search reliability now coexist even under peak load.
Ripple effects: This asynchronous update method became a standard pattern within the company. Rolled out horizontally to other APIs and projects, it provides an update foundation that combines load leveling with ease of extension, continuously reducing operational costs.
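The three stages can be sketched in Go with in-memory stand-ins: a slice for the outbox table, a channel for Pub/Sub, and a map of documents for the search index. All names are illustrative; in production the relay is a poller or CDC process and the indexer issues partial-upsert requests to Elasticsearch.

```go
package main

import "fmt"

// Event is an outbox row, written in the same transaction as the inventory update.
type Event struct {
	DocID string
	Patch map[string]any // only the changed fields (partial upsert)
}

// pipeline holds in-memory stand-ins for the outbox table, the message bus,
// and the search index documents.
type pipeline struct {
	outbox []Event
	pubsub chan Event
	index  map[string]map[string]any
}

// write records the change event alongside the business write; the request
// returns immediately instead of waiting for the index update.
func (p *pipeline) write(e Event) { p.outbox = append(p.outbox, e) }

// relay drains the outbox to Pub/Sub (a poller or CDC does this in production).
func (p *pipeline) relay() {
	for _, e := range p.outbox {
		p.pubsub <- e
	}
	p.outbox = nil
}

// indexer consumes n events and applies only the diff to each document.
func (p *pipeline) indexer(n int) {
	for i := 0; i < n; i++ {
		e := <-p.pubsub
		doc, ok := p.index[e.DocID]
		if !ok {
			doc = map[string]any{}
			p.index[e.DocID] = doc
		}
		for k, v := range e.Patch {
			doc[k] = v // partial upsert: untouched fields stay as-is
		}
	}
}

func main() {
	p := &pipeline{pubsub: make(chan Event, 16), index: map[string]map[string]any{
		"room-1": {"price": 9000, "stock": 3, "name": "Deluxe"},
	}}
	p.write(Event{DocID: "room-1", Patch: map[string]any{"stock": 2}})
	p.relay()
	p.indexer(1)
	fmt.Println(p.index["room-1"]["stock"], p.index["room-1"]["name"]) // → 2 Deluxe
}
```

Because events are committed with the business write and consumed asynchronously, spikes queue up in Pub/Sub instead of blocking the write path, which is the load-leveling property described above.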

Maintaining Inventory Consistency and Sales Opportunities by Automatically Releasing Expired Reservations

In response to expired reservations lingering in inventory, we designed Redis TTL + background release jobs for automatic recovery, reducing the unreleased rate from 7% to almost 0% and preventing lost sales opportunities.

Perspective / Details
Issue: Some reservations were never released, leaving inventory that “appeared fully booked.” This caused lost sales opportunities and forced staff to release reservations manually.
Response: Introduced a mechanism that automatically releases expired reservations using Redis TTL (expiration) and background jobs.
- Controlled with a unique key to prevent duplicate processing of the same reservation.
- Redis acts only as the trigger; the actual inventory updates are handled by the DB.
Visualization: Turned the number of pending reservations and the elapsed time until release into metrics, and monitored behavior constantly.
Outcome: Unreleased reservations dropped to almost zero, and inventory now updates quickly and accurately, preventing lost sales opportunities.
Effect: Reservation API responses became more stable, lock waits decreased, and manual release work became almost unnecessary.