Circuit Card Academy

Module 07

Digital Board Troubleshooting

Digital boards look intimidating — hundreds of identical-looking ICs — but they obey a brutally simple hierarchy: power → clock → reset → bus activity → logic. Faults at each level produce recognizable signatures. Work the hierarchy top-down and most "dead computer board" mysteries fall in minutes.

1. Logic levels — the vocabulary

A digital signal is a voltage interpreted as 1 or 0 against thresholds:

Family            | Supply       | Input reads LOW below | Input reads HIGH above | Notes
TTL / LVTTL       | 5V / 3.3V    | 0.8V                  | 2.0V                   | The classic thresholds
5V CMOS (HC etc.) | 5V           | ~1.5V (0.3×Vdd)       | ~3.5V (0.7×Vdd)        | Tighter than TTL
3.3V LVCMOS       | 3.3V         | 0.8V                  | 2.0V                   | Ubiquitous now
Lower rails       | 2.5/1.8/1.2V | proportional          | proportional           | Core rails of FPGAs/CPUs

The space between thresholds is undefined. A signal loitering there (e.g., 1.4V on a 3.3V input) is a defect signature in itself.
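The threshold rule can be sketched in a few lines of Python. This is a minimal illustration, assuming the nominal values from the table above; the names THRESHOLDS and classify_level are made up for this example, and a real part's datasheet limits take precedence.

```python
# Input thresholds in volts, copied from the table above: (V_IL max, V_IH min).
# Keys and function name are illustrative only, not a standard API.
THRESHOLDS = {
    "TTL": (0.8, 2.0),
    "LVTTL": (0.8, 2.0),
    "CMOS_5V": (1.5, 3.5),      # roughly 0.3 x Vdd and 0.7 x Vdd at Vdd = 5 V
    "LVCMOS_3V3": (0.8, 2.0),
}


def classify_level(volts, family):
    """Return 'LOW', 'HIGH', or 'UNDEFINED' for a static input voltage."""
    v_il, v_ih = THRESHOLDS[family]
    if volts <= v_il:
        return "LOW"
    if volts >= v_ih:
        return "HIGH"
    # Between the thresholds the input is not guaranteed to read either way.
    return "UNDEFINED"


print(classify_level(0.2, "LVCMOS_3V3"))
print(classify_level(3.1, "LVCMOS_3V3"))
print(classify_level(1.4, "LVCMOS_3V3"))
```

The last call uses the 1.4V example from the text and lands in the undefined band, which is the case worth chasing on the bench.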

5V-tolerant vs not: driving 5V into a non-tolerant 3.3V input kills I/O — relevant when mixed-rail boards fail at interfaces.

2. The signal types you'll meet

3. The top-down ritual for a "dead" digital board

  1. Rails (06 — Troubleshooting Methodology §4): every rail present, in tolerance, clean. Modern boards have power sequencing requirements — rails must come up in order; a sequencer or enable daisy-chain that stalls leaves later rails at 0V with nothing "broken." Check enable pins of dead regulators.
  2. Clock: scope the oscillator output. Right frequency, healthy amplitude? No clock → oscillator power/enable → crystal and its load caps → replace crystal (mechanically fragile; prime suspect after drops/vibration).
  3. Reset: scope it through a power cycle (single-shot). Must release. Stuck low → supervisor IC, sagging rail (supervisor is doing its job), or a shorted reset net. Cycling repeatedly → watchdog: processor is crashing — go look at memory bus, core rails under load, firmware integrity.
  4. Activity: with power+clock+reset good, a working processor does things: bus bursts after release, chip selects strobing, status LEDs, UART chatter. Total silence with a good heartbeat = processor/firmware/memory — on a repair bench this is where boundary scan/functional ATE earns its keep, or where the diode-signature comparison against a golden board (04 — DMM Mastery §5) hunts dead I/O.
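The ritual above is a strict ordering: stop at the first level that fails, because everything below it is meaningless until that level is good. A minimal Python sketch of that ordering follows. The function name, the input format, and the ±5% rail tolerance are assumptions made for illustration; use the board's real tolerances, and note that the sketch does not check ripple ("clean").

```python
def first_failing_stage(rails, clock_ok, reset_released, bus_activity,
                        tolerance=0.05):
    """Walk power -> clock -> reset -> activity; return the first failure.

    rails: dict of rail name -> (nominal_volts, measured_volts)
    tolerance: assumed fractional window (0.05 = +/-5 %), not a board spec.
    Returns a short hint string, or None if all four levels look good.
    """
    # Level 1: every rail present and in tolerance.
    for name, (nominal, measured) in rails.items():
        if abs(measured - nominal) > tolerance * nominal:
            return f"rails: {name} measures {measured} V, nominal {nominal} V"
    # Level 2: oscillator output present at the right frequency.
    if not clock_ok:
        return "clock: check oscillator power/enable, then crystal and load caps"
    # Level 3: reset must release after power-up.
    if not reset_released:
        return "reset: check supervisor IC, sagging rail, shorted reset net"
    # Level 4: a released processor should be doing something.
    if not bus_activity:
        return "activity: suspect processor, firmware, or memory"
    return None


board = {"3V3": (3.3, 3.31), "1V8": (1.8, 0.0)}  # hypothetical measurements
print(first_failing_stage(board, clock_ok=False, reset_released=False,
                          bus_activity=False))
```

In this made-up example the function reports the 1V8 rail and never looks at the clock or reset, even though both are also flagged bad. A rail at 0V with nothing visibly damaged is the cue to check the regulator's enable pin and the sequencing chain first.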

4. Digital failure signatures

Symptom | Likely causes
Output stuck high or low regardless of input | Dead driver stage in IC; shorted net (to rail or ground); input side never toggling (trace upstream first)
Mid-level voltage (~half rail) on a push-pull line | Contention (two drivers), or measuring a fast signal with a DMM (it averages! Scope it before declaring weirdness)
Runt pulses | Contention, weak driver, cracked joint making intermittent contact
One bit of a bus dead, others fine | Open trace/via/joint on that line, bent pin, ESD-killed pin
Adjacent pins shorted | Solder bridge (rework history?), dendrite, conformal-coat-hidden whisker
Works cold, dies warm (or inverse) | Cracked joint/via, marginal IC; freeze spray + heat gun to localize
I/O dead only on one connector | ESD/overvoltage entered there; check transceivers and series protection (these interface parts are sacrificial by design; transceiver replacement is the most routine of digital repairs)
Random crashes/watchdog resets | Rail ripple/sag under load, marginal clock, failing memory, intermittent joint
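Several of these signatures can be told apart from a scope capture with nothing more than the input thresholds. The Python sketch below sorts a list of voltage samples from one net into the categories used in the table. The function name and the default thresholds (0.8V and 2.0V, the 3.3V LVCMOS row from section 1) are assumptions for illustration, and "stuck" only means "did not move during this capture".

```python
def classify_trace(samples, v_il=0.8, v_ih=2.0):
    """Classify scope samples (volts) from one net against input thresholds."""
    if not samples:
        raise ValueError("need at least one sample")
    n = len(samples)
    lows = sum(1 for v in samples if v <= v_il)
    highs = sum(1 for v in samples if v >= v_ih)
    mids = n - lows - highs          # samples sitting in the undefined band

    if lows and highs:
        return "toggling"            # reaches both valid levels
    if lows == n:
        return "stuck low"
    if highs == n:
        return "stuck high"
    if mids == n:
        return "static mid-level"    # never reaches a valid level: contention?
    # Leaves one valid level but never reaches the other one.
    return "runt pulses"


print(classify_trace([0.1, 3.2, 0.1, 3.3]))
print(classify_trace([1.6, 1.7, 1.6, 1.6]))
print(classify_trace([0.1, 0.1, 1.4, 0.1]))
```

The second and third captures are the ones a DMM cannot distinguish from a healthy clock, which is why the classification is done on scope samples rather than on a single averaged reading.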

DMM on logic — know its limit: a DMM averages. A 50% duty 3.3V clock reads ~1.65V DC, which looks exactly like contention. Anything that might be toggling gets the scope, not the meter. The DMM's digital jobs are: rails, continuity of bus lines (power off), junction signatures, and stuck-at levels confirmed static by scope first.
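The averaging arithmetic is worth seeing once. For an ideal rectangular wave, the DC average is the low level plus the duty cycle times the swing. The sketch below assumes an ideal waveform and an ideal averaging meter; dmm_dc_reading is a name invented for this example.

```python
def dmm_dc_reading(v_high, duty, v_low=0.0):
    """Ideal DC average of a rectangular wave.

    duty is the fraction of the period spent at v_high (0.0 to 1.0).
    Real meters and real edges will deviate slightly from this figure.
    """
    if not 0.0 <= duty <= 1.0:
        raise ValueError("duty must be between 0 and 1")
    return v_low + duty * (v_high - v_low)


# The clock from the text: 3.3 V swing at 50 % duty.
print(round(dmm_dc_reading(3.3, 0.5), 2))  # 1.65
```

A healthy clock and a contended net can therefore show the same number on the meter; only a time-domain view separates them.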

5. ESD discipline is a digital-board survival rule

Modern CMOS dies at static levels you can't feel (damage threshold far below the ~3kV human perception threshold). Worse than instant death is the walking wounded part: ESD-degraded, passes today, fails on the aircraft. This is why aerospace ESD rules are absolute, not ceremonial — wrist strap, dissipative mat, grounded iron, parts in shielded bags until installation. Full discipline in 10 — Aerospace Standards, ESD, and Workmanship.

6. Repairing around big silicon

You will rarely "fix" a processor — you'll prove the environment around it (rails, clock, reset, interfaces) good, prove its connections good (boundary scan / signature comparison), and replace it only when it's the last suspect standing. BGA replacement is specialist rework (hot-air/IR station, profiles, X-ray verification) — your shop will have a process and a designated station; your diagnostic job is to be sure before that expensive step.

7. Self-check

  1. Recite the hierarchy.
  2. A 3.3V line reads 1.6V on the DMM. Name the two very different explanations and the tool that separates them.
  3. Reset pulses low every 1.6 seconds. What is that telling you, and where do you look next?
  4. I²C bus: SDA permanently low. What are the likely causes?
  5. Why does ATE use boundary scan on BGA boards?

Next: 08 — Analog Board Troubleshooting