Using AI to make projects fail fast sooner | BKGlobal Tech

Using AI to make projects fail fast 3 sprints sooner

Fail-fast is not about coding faster; it is about **knowing you are wrong sooner**. AI does not change that principle, but it moves the point of detection from sprint 7 to sprint 1. This post shares 4 touch-points in the SDLC where the BKGlobal team plugged in AI to shorten the feedback loop, with real C# code examples.

---

Sprint 6. A payment reconciliation feature had passed QA, passed staging, and been in production for 3 days. Then the finance team sent a Slack message: *"The total amounts don't match the accounting report."*

Digging in, the bug sat in an edge case of the rounding logic: decimal precision differs between VND and USD. It could have been caught at code review if someone had looked closely at the unit test coverage for currency conversion. But the reviewer was busy with another sprint and only skimmed the PR. The CI pipeline was green. Merge. Done.

I lost 2 days on the hotfix and 1 week explaining it to stakeholders. That was when I started re-examining the **plan → code → test → review → deploy** loop and asking: where can AI move the point of detection earlier?

---

Fail-fast and why it matters more than ever in the AI era

Fail-fast is a simple principle: the earlier in the project lifecycle a problem is found, the cheaper it is to fix. A bug found at code review costs 1 hour. A bug found in production costs 1 week and an emergency meeting.

What makes it more urgent in 2026: AI helps us write code faster, but it also produces more code. According to METR research (2025), PRs with "high AI use" arrive 16% faster, but average PR size grows by 150% and bug count rises by 9%. Shipping code faster means shipping more bugs if the review process is still humans reading by hand.

The paradox: AI helps you build faster, but without an AI guard at the other end, you are just running downhill faster.

Fail-fast in the AI era = using AI itself as a gate at each step, rather than waiting for the last human in the chain.

4 places to plug AI into the SDLC to fail fast sooner

1. Code review AI: a gate before the human reviewer looks

The first job of code review is not to "find subtle bugs" but to "eliminate the obvious mistakes" so the human reviewer has time left for the hard parts. AI does that first pass better than people do.

According to the DORA 2025 Report, teams using AI code review see a 42–48% improvement in bug detection accuracy, versus under 20% with traditional tools. PR review time dropped from 18 hours to 4 hours after 90 days of adoption.

At BKGlobal, we integrated an AI PR gate into our Azure DevOps pipeline. For every PR opened, a GitHub Actions workflow calls the AI to check a predefined checklist: null reference risks, missing input validation, hardcoded secrets, test coverage gaps.

A real config example, in .github/workflows/ai-pr-review.yml:

name: AI PR Gate

on:
  pull_request:
    branches: [main, develop]

jobs:
  ai-review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Get changed files
        id: changed
        run: |
          # "|| true" keeps the step green when no C# files changed (grep exits 1 on no match)
          git diff --name-only "origin/${{ github.base_ref }}...HEAD" \
            | grep -E '\.(cs|csproj)$' > changed_files.txt || true
          echo "files=$(tr '\n' ',' < changed_files.txt)" >> "$GITHUB_OUTPUT"

      - name: AI Code Review
        uses: anthropics/claude-code-action@v1
        with:
          # Input names differ between releases of this action; check them
          # against the documentation of the version you pin.
          anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
          direct_prompt: |
            Review the following C# code changes for:
            1. Null reference exceptions (especially nullable types without null checks)
            2. Missing input validation on public API endpoints
            3. Decimal precision issues in financial calculations
            4. Missing unit test coverage for edge cases
            5. Hardcoded values that should be configuration

            For each issue found, provide: file, line number, severity (HIGH/MEDIUM/LOW),
            and a concrete fix suggestion.

            Changed files: ${{ steps.changed.outputs.files }}
          post_comment: true
          fail_on_high_severity: true

The important part is fail_on_high_severity: true. If the AI finds a HIGH severity issue, the PR is blocked immediately, without waiting for a reviewer. The reviewer only receives PRs that have already cleared the first layer.
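
The blocking behaviour itself is simple to reason about: somewhere, a step has to turn review findings into an exit code. The following is a minimal sketch of that idea, assuming the findings have already been parsed into a list. The `Finding` record, the `Severity` enum and the `ReviewGate` class are hypothetical names for illustration, not part of any action's API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of one review finding; not the schema of any real tool.
public enum Severity { Low, Medium, High }

public sealed record Finding(string File, int Line, Severity Severity, string Message);

public static class ReviewGate
{
    // Returns the exit code a CI step would use: 0 lets the PR through, 1 blocks it.
    public static int ExitCodeFor(IEnumerable<Finding> findings)
    {
        var blocking = findings.Where(f => f.Severity == Severity.High).ToList();

        // Print each blocking finding so the CI log explains why the PR failed.
        foreach (var f in blocking)
            Console.Error.WriteLine($"{f.File}:{f.Line} [HIGH] {f.Message}");

        return blocking.Count > 0 ? 1 : 0;
    }
}
```

A CI step could end with `Environment.Exit(ReviewGate.ExitCodeFor(findings))`, so that any HIGH finding fails the job while LOW and MEDIUM findings only show up as comments.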

The decimal precision bug from the start of this post? The AI would flag it on the first pass with a note like: "Financial calculation using decimal division without explicit rounding: potential precision mismatch between currencies."
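
To make that class of bug concrete, here is one way a VND/USD total can drift. This is a sketch of the general mechanism, not necessarily the exact production bug: rounding each converted line and then summing can disagree with summing first and rounding once. The `RoundingDemo` class and the two-currency `DecimalsFor` table are assumptions for illustration.

```csharp
using System;
using System.Linq;

public static class RoundingDemo
{
    // Assumed minor-unit digits: VND has none, everything else here is treated as 2 (USD).
    public static int DecimalsFor(string currency) => currency == "VND" ? 0 : 2;

    public static decimal RoundFor(string currency, decimal amount) =>
        Math.Round(amount, DecimalsFor(currency), MidpointRounding.AwayFromZero);

    // Converts and rounds every line to USD cents, then adds the rounded lines.
    public static decimal SumOfRounded(decimal[] vndAmounts, decimal vndPerUsd) =>
        vndAmounts.Sum(a => RoundFor("USD", a / vndPerUsd));

    // Adds the VND lines first, converts the total, and rounds once.
    public static decimal RoundedSum(decimal[] vndAmounts, decimal vndPerUsd) =>
        RoundFor("USD", vndAmounts.Sum() / vndPerUsd);
}
```

With three lines of 100,000 VND at 25,400 VND per USD, `SumOfRounded` returns 11.82 while `RoundedSum` returns 11.81. A one-cent gap like this is exactly what a finance report will surface if the application and the accounting system round at different points.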

2. AI-generated tests: covering the code the AI just wrote

The problem with AI-generated code: developers trust the result but skimp on tests because the code "looks right". Self-healing test suites and AI test generation address exactly this.

The workflow we use: whenever Copilot suggests a complex implementation, immediately prompt Copilot (or Claude) to generate tests for that same code.

A real example with a currency calculation service:

// CurrencyConverter.cs: original code
using System;
using System.Collections.Generic;

public class CurrencyConverter
{
    private readonly Dictionary<string, decimal> _rates;

    public CurrencyConverter(Dictionary<string, decimal> rates)
    {
        _rates = rates ?? throw new ArgumentNullException(nameof(rates));
    }

    public decimal Convert(decimal amount, string fromCurrency, string toCurrency)
    {
        if (fromCurrency == toCurrency) return amount;

        if (!_rates.TryGetValue(fromCurrency, out var fromRate))
            throw new InvalidOperationException($"Currency not supported: {fromCurrency}");

        if (!_rates.TryGetValue(toCurrency, out var toRate))
            throw new InvalidOperationException($"Currency not supported: {toCurrency}");

        // Convert to the base currency (USD), then to the target currency
        return Math.Round((amount / fromRate) * toRate, 4, MidpointRounding.AwayFromZero);
    }
}

The prompt for AI test generation, which I actually use with Claude Code:

Generate comprehensive xUnit tests for CurrencyConverter.Convert() covering:
- Happy path: VND to USD, USD to EUR
- Same currency (should return same amount)
- Edge cases: zero amount, very large amounts (1 billion VND)
- Precision: verify decimal rounding to 4 places
- Error handling: unsupported currency throws correct exception
- Boundary: conversion chaining (VND → USD → EUR) accumulated precision loss

Use realistic exchange rates (USD=1, VND=25400, EUR=0.92).
Follow AAA pattern. No mocks needed — inject rates via constructor.

The tests the AI generated (passing against the converter above):

using System;
using System.Collections.Generic;
using Xunit;

public class CurrencyConverterTests
{
    private static readonly Dictionary<string, decimal> TestRates = new()
    {
        ["USD"] = 1m,
        ["VND"] = 25400m,
        ["EUR"] = 0.92m,
        ["JPY"] = 149.5m
    };

    private readonly CurrencyConverter _sut = new(TestRates);

    [Theory]
    [InlineData("VND", "USD", 25400, 1.0000)]
    [InlineData("USD", "EUR", 1, 0.9200)]
    [InlineData("VND", "EUR", 25400, 0.9200)]
    public void Convert_HappyPath_ReturnsCorrectAmount(
        string from, string to, decimal amount, decimal expected)
    {
        var result = _sut.Convert(amount, from, to);
        Assert.Equal(expected, result);
    }

    [Theory]
    [InlineData("USD", 0)]
    [InlineData("USD", 1_000_000_000)] // very large amount (1 billion)
    public void Convert_SameCurrency_ReturnsSameAmount(string currency, decimal amount)
    {
        var result = _sut.Convert(amount, currency, currency);
        Assert.Equal(amount, result);
    }

    [Fact]
    public void Convert_PrecisionChaining_AccumulatedErrorWithinTolerance()
    {
        // VND → USD → EUR: check that accumulated precision loss stays below 0.01%
        var vndToUsd = _sut.Convert(1_000_000m, "VND", "USD");
        var usdToEur = _sut.Convert(vndToUsd, "USD", "EUR");
        var directVndToEur = _sut.Convert(1_000_000m, "VND", "EUR");

        var tolerance = directVndToEur * 0.0001m; // 0.01% tolerance
        Assert.InRange(usdToEur, directVndToEur - tolerance, directVndToEur + tolerance);
    }

    [Theory]
    [InlineData("XYZ", "USD")]
    [InlineData("USD", "XYZ")]
    public void Convert_UnsupportedCurrency_ThrowsInvalidOperationException(
        string from, string to)
    {
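        // Pass the lambda directly: a Func<decimal> variable does not convert to
        // the Action or Func<object> overloads of Assert.Throws.
        Assert.Throws<InvalidOperationException>(() => _sut.Convert(100m, from, to));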
    }
}

Worth noting: the AI generated Convert_PrecisionChaining_AccumulatedErrorWithinTolerance on its own, an edge case I had not thought of at first but one that really matters in a financial system. This is exactly the class of error behind the production bug from the start of this post.
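
The chaining test uses a large amount, where 4-decimal rounding barely registers. A round trip on a small amount is a useful companion check. The sketch below mirrors the formula in Convert() and reuses the test rates (VND=25400, USD=1); the `RoundTripCheck` class and its method names are hypothetical and exist only for this illustration.

```csharp
using System;

public static class RoundTripCheck
{
    // Same formula as Convert(): go through the base currency, round to 4 places.
    public static decimal Convert(decimal amount, decimal fromRate, decimal toRate) =>
        Math.Round((amount / fromRate) * toRate, 4, MidpointRounding.AwayFromZero);

    // VND -> USD -> VND using the test rates (VND=25400 per USD, USD=1).
    public static decimal VndRoundTrip(decimal vnd)
    {
        var usd = Convert(vnd, 25400m, 1m);
        return Convert(usd, 1m, 25400m);
    }
}
```

`VndRoundTrip(100m)` returns 99.06, and `VndRoundTrip(1m)` returns 0, because 1 VND is smaller than the fourth decimal place of a dollar. Whether that loss is acceptable is a business decision, which is a reason to write the expectation down as a test rather than leave it implicit.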

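According to figures from totalshiftleft.ai, organizations using AI testing cut pipeline execution time from 4 hours to 45 minutes (quoted as an 88% improvement, though 240 to 45 minutes works out to about 81%) and reduce defect escape rate by 35%.

3. AI CI gate: the pipeline decides which tests to run

A common CI pipeline problem: running all 3,000 tests for every PR, even one that only edits the README. The team loses patience → skips CI and relies on local runs → bugs reach production.

AI-powered test selection addresses this: it analyzes the code diff and runs only the relevant tests.

An example using MSBuild and a custom Roslyn-based analysis step integrated into the pipeline:

// AffectedTestSelector.cs: runs before CI to build the list of tests to run
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;
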
public class AffectedTestSelector
{
    private readonly IReadOnlyList<string> _changedFiles;
    private readonly string _testProjectPath;

    public AffectedTestSelector(IReadOnlyList<string> changedFiles, string testProjectPath)
    {
        _changedFiles = changedFiles;
        _testProjectPath = testProjectPath;
    }

    /// <summary>
    /// Uses Roslyn to analyze the dependency graph and find the test classes
    /// that reference types declared in the changed files.
    /// Fallback: if more than 20 files changed, run the full suite (safety net).
    /// </summary>
    public async Task<IReadOnlyList<string>> GetAffectedTestsAsync()
    {
        if (_changedFiles.Count > 20)
        {
            // Too many changed files, possibly a large refactor: run the full suite
            return ["*"]; // signals the test runner to run everything
        }

        var affectedTypes = await ExtractChangedTypeNamesAsync();
        var affectedTests = await FindTestsReferencingTypesAsync(affectedTypes);

        // Always add the integration tests of the modules that changed
        var integrationTests = affectedTypes
            .SelectMany(t => GetModuleIntegrationTests(t))
            .Distinct()
            .ToList();

        return affectedTests.Union(integrationTests).ToList();
    }

    private async Task<IReadOnlyList<string>> ExtractChangedTypeNamesAsync()
    {
        var typeNames = new List<string>();
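        // Skip files the diff lists but that no longer exist on disk (deleted in the PR)
        foreach (var file in _changedFiles.Where(f =>
            f.EndsWith(".cs", StringComparison.OrdinalIgnoreCase) && File.Exists(f)))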
        {
            var source = await File.ReadAllTextAsync(file);
            var tree = CSharpSyntaxTree.ParseText(source);
            var root = await tree.GetRootAsync();

            var types = root.DescendantNodes()
                .OfType<TypeDeclarationSyntax>()
                .Select(t => t.Identifier.Text);

            typeNames.AddRange(types);
        }
        return typeNames;
    }

    // ... FindTestsReferencingTypesAsync uses a Roslyn workspace to traverse
}

In practice we do not build a custom tool this complex for every project. For .NET, dotnet-affected (open-source) does similar work and only needs YAML configuration. The principle is what matters: tooling reads the code and decides, instead of a human configuring it by hand. (Strictly speaking, Roslyn here is static analysis rather than a machine-learning model.)
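
Whatever produces the list of affected tests, something still has to hand it to the test runner. One option is the `--filter` argument of `dotnet test`, which accepts `FullyQualifiedName~Name` terms joined with `|`. The sketch below assumes the selector returns test class names and uses "*" as the run-everything signal, as in the selector above; `TestFilterBuilder` is a hypothetical helper name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TestFilterBuilder
{
    // Turns a list of affected test class names into a value for `dotnet test --filter`.
    // Returns an empty string when no filter should be applied: either the "*" signal
    // was given, or nothing was selected and the full suite is the safer choice.
    public static string Build(IReadOnlyList<string> affectedTests)
    {
        if (affectedTests.Count == 0 || affectedTests.Contains("*"))
            return string.Empty;

        return string.Join("|", affectedTests
            .Distinct()
            .Select(name => $"FullyQualifiedName~{name}"));
    }
}
```

The pipeline script can then run `dotnet test --filter "<value>"` when the result is non-empty and a plain `dotnet test` otherwise. Falling back to the full suite on an empty selection is a deliberate choice here: a selector that finds nothing is more likely to be wrong than the PR is to be untestable.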

Results after rolling this out on one enterprise project: CI time for the average PR dropped from 18 minutes to 4 minutes, and developers stopped bypassing CI.

4. AI monitoring: fail fast in production before users report

Fail-fast does not end at deployment. The most dangerous period is the first hour after a release.

Instead of waiting for the error rate to cross a hard threshold (something like if errorRate > 5%), AI anomaly detection learns the system's "normal" pattern and alerts on deviation, even when the absolute number is still within limits.

A real example with Azure Application Insights and a custom alert rule:

// AnomalyDetectionService.cs: runs in a background worker every 2 minutes
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.ApplicationInsights;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;

public class AnomalyDetectionService : BackgroundService
{
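    private readonly TelemetryClient _telemetry;
    private readonly ILogger<AnomalyDetectionService> _logger;
    private readonly IAlertNotifier _alertNotifier;

    // Rolling window over the last 30 minutes, one bucket every 2 minutes
    private readonly Queue<MetricBucket> _window = new(capacity: 15);

    // Dependencies are supplied by the host's DI container
    public AnomalyDetectionService(
        TelemetryClient telemetry,
        ILogger<AnomalyDetectionService> logger,
        IAlertNotifier alertNotifier)
    {
        _telemetry = telemetry;
        _logger = logger;
        _alertNotifier = alertNotifier;
    }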

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        while (!stoppingToken.IsCancellationRequested)
        {
            await Task.Delay(TimeSpan.FromMinutes(2), stoppingToken);

            var currentBucket = await CollectCurrentMetricsAsync();
            _window.Enqueue(currentBucket);

            if (_window.Count > 15)
                _window.Dequeue();

            if (_window.Count >= 10) // Enough data to detect an anomaly
                await DetectAndAlertAsync(currentBucket);
        }
    }

    private async Task DetectAndAlertAsync(MetricBucket current)
    {
        var historical = _window.SkipLast(1).ToList();
        var avgErrorRate = historical.Average(b => b.ErrorRate);
        var stdDev = CalculateStdDev(historical.Select(b => b.ErrorRate));

        // Z-score: how many standard deviations the current value sits from the baseline.
        // A perfectly flat baseline (stdDev == 0) would otherwise hide every spike,
        // so any increase over it is treated as an anomaly.
        var zScore = stdDev > 0
            ? (current.ErrorRate - avgErrorRate) / stdDev
            : current.ErrorRate > avgErrorRate ? double.PositiveInfinity : 0;

        // Alert when the current value spikes more than 2.5 sigma, even if the absolute rate is low
        if (zScore > 2.5)
        {
            var message = $"""
                [ANOMALY DETECTED] Error rate spike: {current.ErrorRate:P2}
                Baseline (30min): {avgErrorRate:P2} ± {stdDev:P2}
                Z-score: {zScore:F2} (threshold: 2.5)
                Endpoint affected: {current.TopErrorEndpoint}
                Action: Check recent deployment or infra changes
                """;

            // Log through a placeholder so braces in endpoint names are not parsed as a template
            _logger.LogWarning("{AnomalyMessage}", message);
            await _alertNotifier.SendAsync(AlertSeverity.High, message);
        }
    }

    private static double CalculateStdDev(IEnumerable<double> values)
    {
        var list = values.ToList();
        var avg = list.Average();
        var variance = list.Average(v => Math.Pow(v - avg, 2));
        return Math.Sqrt(variance);
    }
}

On my team's most recent deploy, anomaly detection raised an alert after 8 minutes with the message "Error rate spike on /api/payments/reconcile: 0.8% (baseline 0.1%, z-score 3.1)". Not a high error rate, only 0.8%, but unusual against the baseline. We rolled back immediately instead of waiting for user complaints.
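
The difference between the two alerting styles is easy to check in isolation. The following is a small sketch with made-up baseline numbers; `SpikeCheck` and its method names are hypothetical, and the 5% limit and 2.5 sigma values are taken from the text above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SpikeCheck
{
    // Population z-score of `current` against `baseline`.
    // With a flat baseline, any increase counts as an infinitely large deviation.
    public static double ZScore(IReadOnlyList<double> baseline, double current)
    {
        var mean = baseline.Average();
        var stdDev = Math.Sqrt(baseline.Average(v => (v - mean) * (v - mean)));
        if (stdDev > 0) return (current - mean) / stdDev;
        return current > mean ? double.PositiveInfinity : 0;
    }

    // Classic hard threshold: fires only when the absolute rate exceeds the limit.
    public static bool FixedThresholdFires(double current, double limit = 0.05) =>
        current > limit;

    // Relative check: fires when the value is unusual compared with recent history.
    public static bool ZScoreFires(IReadOnlyList<double> baseline, double current, double sigma = 2.5) =>
        ZScore(baseline, current) > sigma;
}
```

With a baseline hovering around 0.1% (for example 0.0008, 0.0012, 0.0010, 0.0009, 0.0011) and a current error rate of 0.8%, `FixedThresholdFires` returns false while `ZScoreFires` returns true. The hard threshold would have stayed silent through that deploy.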

Lessons learned: AI fail-fast is not a silver bullet

After 6 months of use, here is what I have learned, including the parts I did not expect:

What AI does well:

Detecting known error patterns: null references, missing validation, obvious security issues
Generating test coverage quickly for new code, especially the happy path and common edge cases
Reducing context switching for reviewers: AI handles the first layer, people review the deeper one

What AI CANNOT do:

Understand business context. AI does not know that amount = 0 is valid for a "100% refund" but invalid for a "new payment". You have to teach it through the prompt or a custom rule.
Catch architecture errors. AI reviews file by file and does not see the big picture. Circular dependencies, the wrong bounded context, the wrong abstraction level: these still need an architect's review.
Replace end-to-end tests. AI test generation is good at unit tests and weak at integration tests. Do not skip manual testing for important user flows.
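
The amount = 0 case above is a good candidate for a custom rule, because once it is written down as code, both the AI reviewer and the human one have something concrete to check against. A minimal sketch, assuming only the two transaction kinds mentioned in the text; `TransactionKind` and `AmountRules` are hypothetical names.

```csharp
using System;

public enum TransactionKind { NewPayment, FullRefund }

public static class AmountRules
{
    // Hypothetical business rule: zero is acceptable for a full refund,
    // but a new payment must be strictly positive.
    public static bool IsValid(TransactionKind kind, decimal amount) => kind switch
    {
        TransactionKind.FullRefund => amount >= 0m,
        TransactionKind.NewPayment => amount > 0m,
        _ => throw new ArgumentOutOfRangeException(nameof(kind))
    };
}
```

Pasting a rule like this into the review prompt, or pointing the AI at the file that contains it, gives it the business context it cannot infer from the diff alone.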

Practical advice if you want to start right away:

Start with AI code review: it has the clearest ROI and the least friction. Use GitHub Copilot Code Review or CodeRabbit and add it to your PR workflow. You will see results within a week.

Add AI test generation to onboarding: whenever Copilot suggests complex code, add a "prompt the AI to generate tests" step. Make it a habit, not a rule.

Do not trust AI 100%: review what the AI generates. AI tests can pass while testing the wrong thing. An AI reviewer can miss critical business logic. Humans are still the last gate.

Measure before you adopt: record a baseline of average PR review time, bugs per sprint, and MTTR for incidents. After 1 month, compare. If nothing has improved, adjust how you use the tools.
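
The baseline does not need a dashboard to get started. A sketch of the two numbers worth computing first, assuming you can export PR opened and approved timestamps from your hosting platform; `PullRequestSample` and `ReviewMetrics` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// One exported PR: when it was opened and when it was approved.
public sealed record PullRequestSample(DateTime OpenedAt, DateTime ApprovedAt);

public static class ReviewMetrics
{
    // Mean time from PR opened to PR approved, in hours.
    public static double AverageReviewHours(IEnumerable<PullRequestSample> samples) =>
        samples.Average(s => (s.ApprovedAt - s.OpenedAt).TotalHours);

    // Relative change against the baseline, in percent; negative means the metric went down.
    public static double PercentChange(double baseline, double current) =>
        (current - baseline) / baseline * 100.0;
}
```

Run `AverageReviewHours` on the month before adoption and the month after, then feed both values to `PercentChange`. The same pattern works for bugs per sprint and MTTR.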

Closing

Fail-fast has always been the right principle. What has changed is that we now have the tools to move the point of detection from "sprint 7, when users complain" to "sprint 1, when the PR is opened".

The decimal precision bug from the start of this post: with the AI gate we run today, it would be caught at code review. Not because AI is smarter than a developer, but because AI does not get tired, does not get busy, and does not skip a review because of a deadline.

The BKGlobal team is still iterating. The next step we are trying: AI-powered sprint planning, which analyzes task complexity and warns early when scope is at risk of overrunning. But that is a story for another post.

Where in the SDLC are you using AI? Is there a tool or approach that works well and that my team has not tried? Share it in the comments; I read them all.

Son Do — BKGlobal Tech Team

#BKGlobal #dotnet #architecture #1percentbetter