/var/log/DMIT-NOC.log
DMIT has donated US$5,000 to the Hong Kong SAR to support relief efforts following the recent severe fires in Hong Kong. We hope this contribution helps affected residents through this difficult time.
GSL AS137409: severe packet loss. Traffic is temporarily detoured.
DMIT LAX Network Failure Analysis

At approximately 19:35:00 Pacific Time, DMIT deployed a change within the LAX metro to introduce IPv6 over MPLS and IS-IS for the access switches.

1. DMIT uses loopback addresses for IGP routing on all devices.
2. However, in the IPv6 RR configuration, we did not standardize the next-hop for IPv6 routes received from access switches: the next-hop was not rewritten to the peer address and remained the final interface address.
3. By default, iBGP does not automatically rewrite the next-hop to the peer address.
4. To prevent certain customers from using reserved IPv4/IPv6 addresses as Point-to-Point (PtP) interface addresses, DMIT's internal network does not propagate specific port addresses. (This is why we have to rewrite the next-hop for all iBGP routes.)
5. As a result, the edge router could not resolve the actual next-hop (the final interface address).
6. When DMIT's border router fails to find a specific next-hop, it falls back to the Transit table.
7. In the Transit table, the route programmed in the FIB pointed to the customer table.

These factors collectively caused IPv6 traffic originating from customers to loop continuously through multiple VRFs on a single router until the TTL (hop limit) of 128 expired.

This ultimately exhausted backplane bandwidth, resulting in RR disconnections. When the RR disconnected, customer routing was interrupted, the looping traffic was dropped, and the network recovered briefly before the loop recurred.
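
The loop described above can be illustrated with a minimal sketch. It is a toy model, not DMIT's router logic: the table names are hypothetical, and it assumes every pass between VRFs decrements the hop limit by one. Under that assumption a single customer packet costs 128 lookups on the same router before it is dropped, which is consistent with the backplane exhaustion described in the report.

```python
# Toy model (assumption): two VRF tables on one router that point at each
# other for an IPv6 next-hop that cannot be resolved. Names are hypothetical.
FALLBACK = {
    "customer": "transit",  # unresolved next-hop falls back to the Transit table
    "transit": "customer",  # Transit FIB entry points back to the customer table
}


def forward(start_vrf: str, hop_limit: int) -> tuple:
    """Follow the per-VRF fallback until the hop limit reaches zero.

    Returns (number of intra-router lookups, final disposition).
    Assumes each VRF-to-VRF pass decrements the hop limit by one.
    """
    vrf = start_vrf
    lookups = 0
    while hop_limit > 0:
        next_vrf = FALLBACK.get(vrf)
        if next_vrf is None:
            # No fallback entry: the packet leaves this toy model normally.
            return lookups, "delivered"
        vrf = next_vrf
        hop_limit -= 1
        lookups += 1
    return lookups, "dropped: hop limit exceeded"


if __name__ == "__main__":
    print(forward("customer", hop_limit=128))
```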

This configuration fault caused 3 minutes of downtime and a cumulative 13 minutes of degraded service. DMIT sincerely apologizes for this incident.
We have updated our TOS (Terms of Service)

1. TOS (Terms of Service) | Refund Policy

19.1.1 Full refunds are available in compliance with the rules below as well as 19.3.
1. The service is purchased for no more than 3 days;
1. The service is purchased for no more than 3 days (New order only);

19.2.1 Partial refunds are available in compliance with the rules below as well as 19.3.
1. The service is purchased for no more than 30 days;
1. The service is purchased for no more than 30 days (New order only);


19.3.1 We will not refund to the payment gateway under the following circumstances:

1. The order is a renewal order/invoice, and the payment has been successfully processed; (Added)
2. The invoice was paid using existing account credit (refunds can only be returned to the account credit); (Added)
3. The order/invoice is to add funds to the account credit; (Added)

2. TOS (Terms of Service) | IP Replacement Policy

20.3 For Tier 1 (Standard) network profile:
1. Without IP Guarantee+ Addon: DMIT does not guarantee that the IP is globally accessible for a new order, especially in China, Russia, and any country with national network censorship.
1. Monthly Billing Cycle: DMIT does not guarantee that the IP is globally accessible for a new order, especially in China, Russia, and any country with national network censorship.

2. With IP Guarantee+ Addon: Guaranteed first connection in sensitive areas.
2. Non-Monthly Billing Cycle: Guaranteed first connection in sensitive areas.
LAX Pro:
CN2 AS4809 DRT single point of failure. Waiting for CT to respond.
CN2 misconfigured the interface policer on the CSLA failover path, which led to congestion.
HKG:
We observed packet loss on some host nodes.
NOC is working on it.
DMIT has already escalated this issue to the highest level of support.
Based on current information, CTG NOC did not deliver the service as contracted, including the specified BGP configuration and interface rates.
A CTG HKG BGP session misconfiguration led to session resets.
DMIT is escalating this issue with CT Group.

Please wait for further updates.
Both CN2 LAX and HKG:

China Telecom did not configure the BGP sessions and interfaces correctly.
The service auto-failover triggered a cascade of failures.

In addition, CTG NOC has not been helpful, and the escalated NOC team is not responding.
The HKG CTG CN2 session has recovered.
Routing will be restored soon.

We observed a DDoS attack targeting 3 of our HKG.Pro subnets, which may cause some customers to experience packet loss.

The longer-than-expected downtime is caused by the lack of response from CTGnet NOC.
DMIT has been working with CTG NOC since 9 AM EST today, but the issue has still not been fully resolved.

The service failure notice will be published once we have secured all services.
Update:

LAX:
We've connected with CTG NOC and CT Group.
There will be a conference soon for this emergency.

HKG:
The DDoS has been relieved. We are diagnosing an issue on the mitigation facility. The DDoS traffic was mostly filtered, but we observed some performance issues on the mitigation devices. Some packet loss was caused by the mitigation devices rather than by the DDoS itself.
We have used remote DDoS mitigation facilities and applied rules on the IP Transit. So far it is stable, but we are still working on fixing anything that does not work as expected.
Update:

LAX:
IPv4 sessions are restored.

Remaining issue:
- Both interfaces are not configured correctly due to a CTG internal mistake.
- The BGP session is misconfigured due to multi-party miscommunication (DMIT <> CTG <> CT-Group).

HKG:
The new type of DDoS is ongoing. The filter is working as expected.
Mitigation facility repair work is on track.

The report will be ready once everything is fully resolved.
So far, the service is still within SLA.
DMIT Network Incident Report: LAX & HKG
This is the last update unless another major event requires one.

Here is the combined technical postmortem regarding the recent network instability.

🇺🇸 LAX CN2 GIA Incident
Current Status: All immediate mitigations applied. Final correction from CTG is pending due to the China-wide "Network Freeze" (ending Dec 15).

1. Root Cause: Prefix Limit Exceeded

The Mismatch: DMIT ordered a 1k prefix-limit, but the provider (CTG) left it at the default 300. This parameter is non-testable after service delivery, so we trusted the configuration.

The Trigger: Increased route announcements from two clients, plus multiple DDoS RTBH routes, pushed the count over 300.

The Result: AS4809 (CN2) immediately idled the BGP session upon exceeding the limit.
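
A minimal sketch of the maximum-prefix behaviour described above. The limits (300 delivered, 1,000 ordered) come from the report; the individual route counts are hypothetical, since the report only states that the total went over 300. The function name and the state strings are illustrative and do not reflect any vendor's CLI output.

```python
ORDERED_LIMIT = 1000   # "1k" prefix-limit ordered by DMIT (from the report)
DELIVERED_LIMIT = 300  # default left in place by the provider (from the report)


def session_state(prefix_count: int, prefix_limit: int) -> str:
    """Return the BGP session state under a hard maximum-prefix policy.

    The session stays up while the received prefix count is within the
    limit and is torn down (idled) as soon as the count exceeds it.
    """
    return "Established" if prefix_count <= prefix_limit else "Idle"


# Hypothetical counts, chosen only so that the total crosses 300.
baseline_routes = 280
new_customer_routes = 15
rtbh_routes = 10
total = baseline_routes + new_customer_routes + rtbh_routes

if __name__ == "__main__":
    print(total, session_state(total, DELIVERED_LIMIT))  # limit as delivered
    print(total, session_state(total, ORDERED_LIMIT))    # limit as ordered
```

With the ordered limit in place, the same hypothetical total would have left the session established.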

2. Why did failover result in packet loss?

Design: The redundant session (CoreSite) remained UP as designed (filtering DDoS routes to save prefix space).

The Critical Failure: Provider LACP Misconfiguration. CTG provisioned our link aggregation with the capacity of a single interface, ignoring our multiple physical 10G connections.

Impact: When traffic shifted to CoreSite, it exceeded the logical 10G cap, causing severe congestion and packet loss despite physical capacity being available.
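
The capacity mismatch can be shown with simple arithmetic. In the sketch below, the 10G member speed is from the report, while the number of bundle members and the offered load are assumptions made only for illustration (the report says "multiple" links and gives no traffic figure).

```python
LINK_GBPS = 10.0  # speed of each physical member link (from the report)


def excess_gbps(offered_gbps: float, cap_gbps: float) -> float:
    """Traffic in Gbps that does not fit under the given capacity."""
    return max(0.0, offered_gbps - cap_gbps)


members = 2            # assumption: the report only says "multiple" links
offered_gbps = 14.0    # assumption: load shifted to CoreSite during failover

logical_cap = LINK_GBPS             # bundle treated as a single interface
physical_cap = LINK_GBPS * members  # what the bundle could actually carry

if __name__ == "__main__":
    print("excess at logical cap:", excess_gbps(offered_gbps, logical_cap))
    print("excess at physical cap:", excess_gbps(offered_gbps, physical_cap))
```

Under these assumed numbers, 4 Gbps is dropped or queued at the logical cap even though the physical links have headroom.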

3. Why the long recovery?

Administration: Due to the "Network Freeze," router CLI access is suspended.

Approval: CTA/CTG required emergency access approval from the Group level. Since it was after-hours in China, getting this authorization took significant time.
================
🇭🇰 HKG Incident
Current Status: 99.9% of traffic is successfully filtered. Active monitoring in place. 10Mpps ongoing.

1. Root Cause: "Carpet Bombing"

Attack Type: A massive Carpet Bombing attack targeted 3 specific subnets.

Vectors: Mixed volume of TCP-SYN, TCP-ACK (Zero/Empty), SYN-ACK, TCP Null, FIN, RST.
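
As an illustration of how the vectors listed above differ on the wire, the following sketch labels a packet by its TCP control bits. The bit values are the standard TCP flag positions; the function name and the label strings are hypothetical, and this is only a labelling aid, not DMIT's filter. Legitimate traffic uses the same flags, so real mitigation also needs connection state and rate information.

```python
# Standard TCP control bits (low six bits of the flags field).
FIN, SYN, RST, PSH, ACK, URG = 0x01, 0x02, 0x04, 0x08, 0x10, 0x20


def classify(flags: int, payload_len: int) -> str:
    """Map a TCP flags value and payload length to one of the listed vectors."""
    flags &= 0x3F  # ignore ECN/reserved bits
    if flags == 0:
        return "TCP Null"
    if flags == SYN:
        return "TCP-SYN"
    if flags == (SYN | ACK):
        return "SYN-ACK"
    if flags == ACK and payload_len == 0:
        return "TCP-ACK (empty)"
    if flags & RST:
        return "RST"
    if flags & FIN:
        return "FIN"
    return "other"


if __name__ == "__main__":
    for f, n in [(SYN, 0), (SYN | ACK, 0), (ACK, 0), (0, 0), (FIN, 0), (RST, 0)]:
        print(hex(f), classify(f, n))
```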

2. Why did mitigation fail initially?

The Leak: A combination of misconfigured detour rules and a hardware fault caused traffic to bypass local scrubbers. Malicious traffic entered directly via the backbone (LAX IP Transit).

The "Red Herring": We initially focused on refining rules, not realizing the mitigation equipment itself had a hardware/software fault. This misled our diagnosis and delayed the fix.

3. Resource Contention

The concurrent critical failure in LAX required non-stop coordination, splitting our engineering resources and inevitably slowing down the HKG diagnosis.

🛡️ Future Prevention & Commitment
Stricter Auditing: We will implement an extra layer to manually review every text field on vendor orders to ensure delivered configurations (like Prefix Limits and LACP speeds) match our requirements perfectly.

The Reality: DDoS vectors evolve rapidly. While we cannot guarantee zero incidents, DMIT commits to using every resource to maintain stability and protect your business at reasonable costs.
================
Reimbursement: All services, regardless of location and network profile, will have their traffic reset today, and every service will receive one extra free traffic reset to use before May 2026. (This will be delivered later through a website feature.)
DMIT has observed multi-vector DDoS attacks since yesterday targeting all our prefixes, including HKG, TYO and LAX.

You might experience sudden packet loss before the DDoS mitigation takes effect.
From detection through detour to the start of mitigation, it may take up to 1.5s.
We are applying network changes to speed up the mitigation response time.

Internet packet loss:
DMIT uses GSL as one of its IP Transit providers (not a major one) in Los Angeles.
We have heard of Tbps-scale attacks targeting other GSL clients today, which also impacted the rest of GSL's clients, including DMIT.

DMIT will keep monitoring and do our best to serve your business.
HKG router config emergency maintenance in the next hour.
There may be a short detour or blackout.


Done
📢 DMIT Payment System Upgrade & Security Enhancements

To provide a safer and more convenient payment experience, DMIT has officially implemented a new PCI-compliant Stripe payment gateway.

🛡️ Why the change? (PCI Security Standards)
To meet strict security requirements:
1. All payment info must be processed via a PCI DSS compliant gateway.
2. No sensitive card data is stored locally.
3. Transmission and storage must be fully encrypted.

Our new Stripe integration meets all these standards. All credit card and alternative payment inputs are now handled directly within Stripe's secure environment.

What’s New?
- Enhanced Data Protection: Higher security standards for your peace of mind.
- New Payment Methods: We now support Apple Pay, Google Pay, and Cash App, alongside traditional credit cards.

⚠️ Important Notice Regarding Saved Cards
As part of this transition to a secure architecture, we have removed all previously stored payment methods from Stripe and basic card info from WHMCS (we never stored sensitive card data). Please note that the old card management page is no longer available.

💳 How to Add Your New Payment Method

Method 1: During Checkout
Select "Credit Card & Others" at checkout. You can add a new card or use one of the newly supported digital wallets. All info will be securely saved via the new gateway.

Method 2: Via Client Area
You can add or update cards anytime via the new management interface:

- Log in to your account.
- Go to "Billing".
- Select the "Manage Credit Card" tab.
🔗 Direct Link: https://www.dmit.io/index.php?m=stripe_next_cards

Thank you for your understanding and cooperation as we work to provide a more secure service experience.
We have sent emails to all DMIT customers who have used credit cards.

DMIT has completely discontinued the credit card processing method built into WHMCS and deleted all stored remote credit card data.

1. All historical stored credit card records have been permanently deleted, and DMIT has removed all database backups and MySQL Binlogs;
2. Your previously set up credit card auto-renewals are now invalid. Please add new credit card information via Stripe to avoid service interruption.
CTGnet replied that this is due to an NCP cable fault. The traffic between Tokyo and Shanghai is now bypassing Hong Kong.