Knowledge Gaps
What’s NOT documented: power-on timing, voltage stability, error recovery, bus sharing conflicts, and platform integration.
What Are Knowledge Gaps?
Knowledge Gaps are critical dependencies, requirements, or procedures that:
- Exist in reality (engineers know them, or discover them through failure)
- Are NOT documented in any system
- Can cause mission failures if unknown
- Are at risk of loss during team turnover
This analysis found 5 major knowledge gaps in the F´ + PROVES Kit integration.
Gap 1: Power-On Timing Requirements
What You’re Looking At
This sequence diagram shows the power-on process for an I2C device, with all the timing steps highlighted in red because they’re NOT documented anywhere. When you turn on power, the voltage takes time to rise, the device takes time to reset itself, and it takes more time to be ready for communication. Without knowing these delays, you’ll try to talk to the device before it’s ready.
Think of it like: Calling someone right after they wake up. You need time for them to (1) open their eyes (voltage rise), (2) remember who they are (power-on reset), and (3) get their brain working (ready for conversation). Call too early and you’ll just get confused mumbling.
The Missing Specification
sequenceDiagram
participant LSM as LoadSwitchManager
participant Power as Power Supply
participant Device as I2C Device
participant Driver as I2C Bus Driver
LSM->>Power: turn_on("imu")
Power->>Power: GPIO goes HIGH
Power->>Device: Voltage ramps up
Note over Power,Device: ❌ GAP: How long does this take?
rect rgb(255, 200, 200)
Power-->>Device: t_rise = ??? ms
Device-->>Device: Internal power-on reset
Note over Device: ❌ GAP: How long for POR?
Device-->>Device: t_por = ??? ms
Device-->>Device: Initialize registers
Note over Device: ❌ GAP: Ready delay?
Device-->>Device: t_ready = ??? ms
end
Note over Device,Driver: ❌ GAP: Total delay unknown
Driver->>Device: I2C address probe
alt Device Ready
Device-->>Driver: ACK
else Device Not Ready
Device-->>Driver: No response (bus timeout)
end
What’s NOT Documented
| Parameter | F´ Docs | PROVES Docs | Typical Value | Impact if Unknown |
|---|---|---|---|---|
| t_rise - Voltage rise time | ❌ | ❌ | 1-10ms | Race condition |
| t_por - Power-on reset duration | ❌ | ❌ | 10-100ms | Device not initialized |
| t_ready - Ready after POR | ❌ | ❌ | 1-50ms | I2C communication fails |
| t_total - Safe delay before I2C | ❌ | ❌ | 50-200ms | Intermittent failures |
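The typical ranges in the table can be turned into a worst-case delay budget. The sketch below sums the upper bounds and adds a safety margin. The 25% margin and all names are assumptions for illustration, and the ranges themselves are the table's typical values, not datasheet figures.

```python
# Typical ranges from the table above, in milliseconds (min, max).
# Illustrative only: replace with values from the component datasheets.
TIMING_MS = {
    "t_rise": (1, 10),
    "t_por": (10, 100),
    "t_ready": (1, 50),
}


def worst_case_delay_ms(timing, margin=0.25):
    """Sum the upper bounds and add a safety margin (assumed 25%)."""
    worst = sum(high for _low, high in timing.values())
    return worst * (1 + margin)


print(worst_case_delay_ms(TIMING_MS))  # 200.0
```

With these numbers the budget lands on 200 ms, the upper end of the t_total range in the table.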
Where This Knowledge Lives
Currently:
- 🧠 In experienced engineers’ heads
- 📄 Possibly in the MPU6050 datasheet (not referenced in either doc)
- 🐛 Discovered through debugging after failures
- 📧 Discussed in email threads (not captured)
Risk: When Team A graduates, this knowledge is LOST.
Real-World Impact
%%{init: {'flowchart': {'defaultRenderer': 'elk'}}}%%
flowchart TB
START[New developer integrates IMU]
CASE1{Does developer<br/>add delay?}
CASE2{What delay<br/>value?}
CASE3{Test coverage?}
TOO_SHORT[Delay too short<br/>50ms]
WORKS_BENCH[✅ Works on bench<br/>warm start]
FAILS_FLIGHT[❌ Fails in flight<br/>cold start slower]
NO_DELAY[No delay added]
WORKS_LINUX[✅ Works on Linux<br/>scheduler slow enough]
FAILS_RTOS[❌ Fails on RTOS<br/>too fast]
CORRECT[Delay adequate<br/>200ms]
SUCCESS[✅ Always works]
START --> CASE1
CASE1 -->|Yes| CASE2
CASE1 -->|No| NO_DELAY
CASE2 -->|Guesses 50ms| TOO_SHORT
CASE2 -->|Finds spec: 200ms| CORRECT
TOO_SHORT --> CASE3
CASE3 -->|Only bench test| WORKS_BENCH
WORKS_BENCH -.->|Ships to orbit| FAILS_FLIGHT
NO_DELAY --> WORKS_LINUX
WORKS_LINUX -.->|Ports to embedded| FAILS_RTOS
CORRECT --> SUCCESS
style FAILS_FLIGHT fill:#ffcdd2
style FAILS_RTOS fill:#ffcdd2
style SUCCESS fill:#c8e6c9
Estimated probability: roughly 70% of developers will get this wrong without documentation.
Why This Matters: This is the EXACT failure mode from the Team A/Team B scenario. Team A knew the 200ms delay was needed (through trial and error). Team B saw it, thought “that’s too slow,” changed it to 10ms, tested on a warm system (worked), then failed in orbit on a cold boot. All because the timing requirement wasn’t documented.
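One way to keep the delay from being "optimized" away is to give it a name, a comment explaining why it exists, and a readiness check after it. The following is a minimal sketch, assuming hypothetical `turn_on` and `probe` callables supplied by the caller; the timeout and poll interval are placeholder values.

```python
import time

# 200 ms comes from the Team A/Team B scenario above.
# Do not shorten without testing a cold start.
POWER_ON_DELAY_S = 0.200


def power_on_and_wait(turn_on, probe, min_delay_s=POWER_ON_DELAY_S,
                      timeout_s=1.0, poll_s=0.010,
                      sleep=time.sleep, clock=time.monotonic):
    """Enable power, wait the documented minimum, then poll for an ACK.

    turn_on() and probe() are hypothetical callables: turn_on() enables the
    load switch, probe() returns True when the device ACKs its address.
    Returns True if the device answered before the timeout, else False.
    """
    turn_on()
    sleep(min_delay_s)              # never probe before the documented delay
    deadline = clock() + timeout_s
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(poll_s)
```

Passing `sleep` and `clock` as parameters lets a bench test simulate a slow cold start without waiting in real time.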
Gap 2: Voltage Stability Requirements
What You’re Looking At
This diagram shows electrical characteristics that software developers never think about but absolutely matter for I2C communication. Every component has voltage requirements (how clean the power needs to be, how much it can drop, etc.). The diagram shows all the parameters with “???” because they’re not in the software documentation.
Think of it like: Trying to have a phone conversation with a bad connection. If the signal drops too low (voltage dropout), gets too noisy (ripple), or cuts out briefly (current spike), you’ll miss words or get static. Your phone app doesn’t tell you “you need -85dBm signal strength” but that knowledge matters.
The Missing Specification
%%{init: {'flowchart': {'defaultRenderer': 'elk'}}}%%
flowchart TB
subgraph "Power Supply Characteristics"
V_NOM[Nominal Voltage<br/>3.3V]
V_RIPPLE[Ripple: ??? mV]
V_DROPOUT[Dropout: ??? mV]
I_SPIKE[Current spike: ??? mA]
end
subgraph "I2C Bus Requirements"
V_IH[V_IH: Input High<br/>??? V minimum]
V_IL[V_IL: Input Low<br/>??? V maximum]
V_MARGIN[Noise Margin<br/>???]
end
subgraph "Load Switch Characteristics"
R_ON[R_ON: ??? mΩ]
I_MAX[I_MAX: ??? mA]
T_SWITCH[Switch time: ??? μs]
end
V_NOM -.->|minus dropout| V_DROPOUT
V_DROPOUT -.->|must exceed| V_IH
I_SPIKE -.->|causes drop: I × R_ON| R_ON
V_RIPPLE -.->|must be less than| V_MARGIN
style V_RIPPLE fill:#ffebee
style V_DROPOUT fill:#ffebee
style I_SPIKE fill:#ffebee
style V_IH fill:#fff9c4
style V_MARGIN fill:#fff9c4
style R_ON fill:#e1f5ff
What’s NOT Documented
| Parameter | Required For | F´ Docs | PROVES Docs | Impact |
|---|---|---|---|---|
| V_ripple | Clean I2C signals | ❌ | ❌ | Bit errors |
| V_dropout | Load regulation | ❌ | ❌ | Brownout |
| I_spike | Inrush current | ❌ | ❌ | Voltage sag |
| R_ON | Switch resistance | ❌ | ❌ | Power loss |
| V_IH, V_IL | I2C thresholds | ❌ | ❌ | Communication errors |
Where This Knowledge Lives
Currently:
- 📊 Hardware schematics (not linked to software docs)
- 🔬 Oscilloscope measurements during debugging
- 🏭 Component datasheets (MPU6050, load switch IC, regulators)
- 👥 Hardware engineer tribal knowledge
Risk: Software developers don’t know to check these parameters.
Failure Mode
Scenario: High power draw during camera operation
↓
3.3V rail sags to 3.1V (within regulator spec)
↓
I2C V_IH threshold is 0.7 × Vdd = 0.7 × 3.1V = 2.17V
↓
Signal integrity marginal
↓
I2C bus has intermittent bit errors
↓
IMU read returns corrupted data
↓
Attitude determination fails
↓
Mission loss
Time to debug: Days to weeks (requires oscilloscope, experienced hardware engineer)
Key Insight: This gap exists because hardware knowledge and software knowledge live in different teams and different documents. The hardware team knows the voltage requirements (they designed the circuit), but the software team doesn’t have access to that information. This is an organizational problem, not a technical one.
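The arithmetic in the failure mode above can be captured in a few helper functions so that software developers can check margins once the hardware team supplies real numbers. This is a sketch only: it uses the 0.3 × Vdd and 0.7 × Vdd input thresholds, and the 500 mA spike and 100 mΩ switch resistance are hypothetical values, not measurements.

```python
def rail_after_switch_drop(v_nom, i_load_a, r_on_ohm):
    """Voltage downstream of the load switch: V = V_nom - I * R_ON."""
    return v_nom - i_load_a * r_on_ohm


def i2c_thresholds(vdd):
    """Return (V_IL max, V_IH min) using the 0.3 and 0.7 x Vdd ratios."""
    return 0.3 * vdd, 0.7 * vdd


def high_level_margin(v_bus_high, vdd_receiver):
    """Headroom between the bus high level and the receiver's V_IH."""
    _v_il, v_ih = i2c_thresholds(vdd_receiver)
    return v_bus_high - v_ih


# Sagged rail from the scenario above.
print(round(i2c_thresholds(3.1)[1], 2))   # 2.17

# Hypothetical numbers: 500 mA spike through a 100 mOhm switch.
print(round(rail_after_switch_drop(3.3, 0.5, 0.1), 2))   # 3.25
```

A margin that shrinks toward zero under worst-case load is the signal to pull in the hardware team before flight, not after.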
Gap 3: Error Recovery Strategies
What You’re Looking At
This state diagram shows a decision tree for error recovery that SHOULD exist but doesn’t. When an I2C error happens, the system needs to decide: Is this recoverable? Should we retry? Should we power cycle? How many times? The diagram shows that currently, systems just log the error and give up (red path at bottom), when they should follow the recovery decision tree (red-boxed section).
Think of it like: When you drop your phone and the screen freezes, you could either (1) declare it broken and buy a new one, or (2) try turning it off and on again first. Most software does option 1 because nobody documented that option 2 exists and works 90% of the time.
The Missing Integration
stateDiagram-v2
[*] --> Normal: System boot
Normal --> I2C_Error: I2C read fails
I2C_Error --> Log_Warning: F´ logs event
state "KNOWLEDGE GAP" as GAP {
Log_Warning --> Should_Power_Cycle: Decision point
Should_Power_Cycle --> Power_Off: Yes
Should_Power_Cycle --> Give_Up: No
Power_Off --> Wait_Discharge
Wait_Discharge --> Power_On
Power_On --> Wait_Stabilize
Wait_Stabilize --> Retry_I2C
Retry_I2C --> Normal: Success
Retry_I2C --> Try_Again: Fail (retry < max)
Try_Again --> Power_Off
Retry_I2C --> Give_Up: Fail (retry >= max)
}
Give_Up --> Degraded_Mode: Continue without IMU
Log_Warning --> [*]: Currently: No recovery implemented
style GAP fill:#ffebee
style Log_Warning fill:#fff9c4
style Give_Up fill:#ffcdd2
style Normal fill:#c8e6c9
What’s NOT Documented
| Decision Point | Question | F´ Docs | PROVES Docs | Current Reality |
|---|---|---|---|---|
| Error Detection | Which errors are recoverable? | Logs error | N/A | Unknown |
| Recovery Strategy | Should power cycle on I2C error? | ❌ | ❌ | No recovery |
| Retry Count | How many retries before giving up? | ❌ | ❌ | Give up immediately |
| Timing | How long to wait after power cycle? | ❌ | ❌ | N/A |
| Escalation | When to alert operator? | Logs event | N/A | Every error (noisy) |
Missing Decision Tree
No documentation exists for:
IF I2cStatus == I2C_READ_ERR:
    IF consecutive_errors < 3:
        # Try simple retry
        WAIT 10ms
        RETRY read()
    ELSE IF consecutive_errors < 10:
        # Power cycle recovery
        LoadSwitchManager.turn_off("imu")
        WAIT 500ms  # Capacitor discharge
        LoadSwitchManager.turn_on("imu")
        WAIT 200ms  # Power stabilization
        RETRY read()
    ELSE:
        # Permanent failure
        LOG CRITICAL "IMU unrecoverable"
        ENTER degraded_mode
        ALERT operator
This entire decision tree is UNDOCUMENTED.
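For reference, the pseudocode above could be sketched in Python roughly as follows. The thresholds and delays are copied from the pseudocode. Everything else is an assumption: `read` is a hypothetical callable that raises `OSError` on an I2C failure, and `power` is a hypothetical object exposing `turn_off(name)` and `turn_on(name)`.

```python
import time

SIMPLE_RETRY_LIMIT = 3      # below this: plain retry
POWER_CYCLE_LIMIT = 10      # below this: power-cycle, then retry
RETRY_WAIT_S = 0.010
DISCHARGE_WAIT_S = 0.500    # capacitor discharge
STABILIZE_WAIT_S = 0.200    # power stabilization


def read_with_recovery(read, power, name="imu", sleep=time.sleep):
    """Call read() until it succeeds or the error budget is exhausted.

    Returns the value from read(), or None so the caller can log the
    failure, enter degraded mode, and alert the operator.
    """
    consecutive_errors = 0
    while True:
        try:
            return read()
        except OSError:
            consecutive_errors += 1
        if consecutive_errors < SIMPLE_RETRY_LIMIT:
            sleep(RETRY_WAIT_S)
        elif consecutive_errors < POWER_CYCLE_LIMIT:
            power.turn_off(name)
            sleep(DISCHARGE_WAIT_S)
            power.turn_on(name)
            sleep(STABILIZE_WAIT_S)
        else:
            return None   # permanent failure
```

Returning `None` instead of raising keeps the escalation policy (logging, degraded mode, operator alert) in the caller, where mission-specific rules belong.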
Where This Knowledge Lives
Currently:
- 🔬 Discovered through mission operations
- 📝 Procedures written after first failure
- 🧠 Operator tribal knowledge
- ✉️ Communicated verbally between shifts
Risk: Each new mission team rediscovers this through failures.
Why This Matters: Without documented recovery strategies, every team invents their own (or doesn’t bother). This means inconsistent behavior across missions and lost opportunities for automatic recovery. One team might have a sensor permanently fail while another team’s system auto-recovers, purely because of undocumented tribal knowledge.
Gap 4: Bus Sharing and Conflicts
What You’re Looking At
This diagram shows an I2C bus topology where multiple devices share the same communication bus, but nobody documented which devices are where or what addresses they use. The dashed lines represent “unknown” connections. Without knowing the full picture, you might accidentally put two devices at the same address or try to power them all on simultaneously (overloading the power supply).
Think of it like: A party line telephone (old tech, look it up!). Multiple people share one phone line, so you need to know (1) who else is on the line, (2) their “ring codes” (addresses), and (3) not to call everyone at once. Without documentation, you’ll accidentally call the wrong person or interrupt someone else’s call.
The Missing Architecture
%%{init: {'flowchart': {'defaultRenderer': 'elk'}}}%%
flowchart TB
subgraph "I2C Bus Topology (UNDOCUMENTED)"
BUS[I2C Bus /dev/i2c-1<br/>SDA/SCL]
DEV1[Device 1: IMU<br/>Addr: 0x68]
DEV2[Device 2: Magnetometer<br/>Addr: ???]
DEV3[Device 3: Camera<br/>Addr: ???]
DEV4[Device 4: ???<br/>Addr: ???]
end
subgraph "Power Control"
LSM1[IMU_ENABLE]
LSM2[MAG_ENABLE]
LSM3[CAM_ENABLE]
end
subgraph "Questions"
Q1[Are devices on<br/>same bus?]
Q2[Can addresses<br/>conflict?]
Q3[Power-on<br/>sequence?]
Q4[Bus<br/>arbitration?]
end
BUS -.->|unknown| DEV1
BUS -.->|unknown| DEV2
BUS -.->|unknown| DEV3
BUS -.->|unknown| DEV4
LSM1 --> DEV1
LSM2 --> DEV2
LSM3 --> DEV3
DEV1 -.-> Q1
DEV2 -.-> Q2
LSM1 -.-> Q3
BUS -.-> Q4
style BUS fill:#ffebee
style Q1 fill:#fff9c4
style Q2 fill:#fff9c4
style Q3 fill:#fff9c4
style Q4 fill:#fff9c4
What’s NOT Documented
| Aspect | Information Needed | F´ Docs | PROVES Docs | Impact if Unknown |
|---|---|---|---|---|
| Bus Topology | Which devices on which bus? | ❌ | ❌ | Wrong bus configured |
| Address Map | All I2C addresses | Partial (0x68) | ❌ | Address conflicts |
| Power Sequence | Order to enable devices | ❌ | ❌ | Bus contention |
| Simultaneity | Can devices operate together? | ❌ | ❌ | Data corruption |
| Priority | Which device has priority? | ❌ | ❌ | Starvation |
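Once an address map exists, even a very small check can flag conflicts and missing entries before integration. The sketch below assumes a hypothetical map in which every device sits on `/dev/i2c-1`, as drawn in the diagram, and uses `None` for addresses the documentation does not give.

```python
from collections import defaultdict

# Hypothetical address map; None marks an undocumented address.
DEVICES = [
    {"name": "imu", "bus": "/dev/i2c-1", "addr": 0x68},
    {"name": "magnetometer", "bus": "/dev/i2c-1", "addr": None},
    {"name": "camera", "bus": "/dev/i2c-1", "addr": None},
]


def audit_address_map(devices):
    """Return (conflicts, unknown).

    conflicts maps (bus, addr) to the device names sharing that slot.
    unknown lists devices whose address is not documented.
    """
    by_slot = defaultdict(list)
    unknown = []
    for dev in devices:
        if dev["addr"] is None:
            unknown.append(dev["name"])
        else:
            by_slot[(dev["bus"], dev["addr"])].append(dev["name"])
    conflicts = {slot: names for slot, names in by_slot.items()
                 if len(names) > 1}
    return conflicts, unknown


conflicts, unknown = audit_address_map(DEVICES)
print(conflicts)  # {}
print(unknown)    # ['magnetometer', 'camera']
```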
Conflict Scenarios
sequenceDiagram
participant App as Application
participant IMU as IMU Manager
participant Mag as Magnetometer Manager
participant Bus as I2C Bus
participant HW_IMU as IMU Hardware
participant HW_MAG as MAG Hardware
Note over App,HW_MAG: Scenario 1: Address Conflict (UNDOCUMENTED)
App->>IMU: Read IMU data
App->>Mag: Read MAG data
par Simultaneous I2C Transactions
IMU->>Bus: Start transaction to 0x68
and
Mag->>Bus: Start transaction to 0x68 (conflict!)
end
Bus-->>IMU: Data corrupted
Bus-->>Mag: Data corrupted
Note over App,HW_MAG: ❌ Both reads fail, no indication why
Note over App,HW_MAG: Scenario 2: Power-On Glitch (UNDOCUMENTED)
App->>IMU: turn_on IMU
App->>Mag: turn_on MAG
par Simultaneous Power-On
HW_IMU->>HW_IMU: Inrush current spike
and
HW_MAG->>HW_MAG: Inrush current spike
end
Note over HW_IMU,HW_MAG: Combined current exceeds<br/>load switch rating
HW_IMU--XHW_IMU: Brownout / latchup
HW_MAG--XHW_MAG: Brownout / latchup
Note over App,HW_MAG: ❌ Devices damaged, mission loss
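Scenario 2 can be avoided in software by enabling devices one at a time. A minimal sketch, assuming a hypothetical `turn_on(name)` callable; the 50 ms stagger is a placeholder and would need to come from measured inrush behavior.

```python
import time


def staggered_power_on(names, turn_on, stagger_s=0.050, sleep=time.sleep):
    """Enable devices in order, pausing between them.

    The pause keeps inrush current spikes from overlapping. stagger_s is
    an assumed placeholder, not a measured value.
    """
    for index, name in enumerate(names):
        if index > 0:
            sleep(stagger_s)    # let the previous inrush settle
        turn_on(name)
```

The order of `names` is itself part of the missing documentation: the power sequence row in the table above has no documented answer yet.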
Where This Knowledge Lives
Currently:
- 📐 Hardware schematics (separate from software docs)
- 🔍 Reverse-engineered from board layout
- 🧪 Discovered during integration testing
- 🚨 Learned from failures
Risk: Software developers don’t have access to hardware documentation.
Key Insight: This is another hardware/software knowledge gap. The hardware team drew schematics showing the bus topology, but the software team is writing I2C drivers without seeing those schematics. Each team has half the picture; neither has the complete view.
Gap 5: Platform-Specific Integration
What You’re Looking At
This diagram shows how F´ supports multiple platforms (Linux, Zephyr, bare metal) and PROVES Kit supports multiple languages (CircuitPython, C), but there’s no documentation on how to combine them. Each box is a valid configuration, but the arrows with “how to combine?” show that the integration patterns are undocumented.
Think of it like: You have IKEA furniture (F´) and tools from Home Depot (PROVES Kit). Both are good products with instructions, but there’s no guide for “how to use Home Depot tools to assemble IKEA furniture.” Every team figures it out themselves, often differently.
The Missing Cross-Platform Guide
%%{init: {'flowchart': {'defaultRenderer': 'elk'}}}%%
flowchart LR
subgraph "F´ Framework"
F_LINUX[LinuxI2cDriver<br/>Linux]
F_ZEPHYR[ZephyrI2cDriver<br/>Zephyr RTOS]
F_BAREMETAL[Custom Driver<br/>Bare Metal]
end
subgraph "PROVES Kit"
P_CIRCUITPY[CircuitPython<br/>board.IMU_ENABLE]
P_MICROPYTHON[MicroPython<br/>???]
P_C[C/C++<br/>???]
end
subgraph "Integration Patterns"
INT1[F´ Linux +<br/>PROVES CircuitPython]
INT2[F´ Zephyr +<br/>PROVES C]
INT3[F´ Bare Metal +<br/>???]
end
F_LINUX -.->|how to combine?| P_CIRCUITPY
F_ZEPHYR -.->|how to combine?| P_C
F_BAREMETAL -.->|how to combine?| P_C
F_LINUX -.-> INT1
P_CIRCUITPY -.-> INT1
F_ZEPHYR -.-> INT2
P_C -.-> INT2
style INT1 fill:#ffebee
style INT2 fill:#ffebee
style INT3 fill:#ffebee
What’s NOT Documented
| Integration | F´ Platform | PROVES Platform | Documented? | Challenge |
|---|---|---|---|---|
| Desktop Sim | Linux + Python | CircuitPython sim | ❌ | How to mock hardware? |
| Flight Target | Zephyr RTOS + C++ | C + registers | ❌ | How to share GPIO? |
| Lab Test | Linux + Python | Hardware board | ❌ | How to communicate? |
Missing Integration Examples
No documentation exists for:
- How F´ C++ calls PROVES CircuitPython:
```
// ❌ NOT DOCUMENTED
// In F´ configureTopology():
void configureTopology() {
    // How to call Python LoadSwitchManager from C++?
    // - Embed Python interpreter?
    // - Use IPC (sockets, shared memory)?
    // - Compile PROVES to C extension?
    // - Use external process + protocol?
}
```
- How to share GPIO control:
```
❌ NOT DOCUMENTED
- Does F´ control GPIO directly?
- Does PROVES control GPIO and F´ requests power?
- Is there a hardware abstraction layer?
- Who owns the GPIO driver?
```
- Build system integration:
```
# ❌ NOT DOCUMENTED
# How to build F´ + PROVES together?
# - Separate processes?
# - Linked libraries?
# - Microservice architecture?
```
Where This Knowledge Lives
Currently:
- 🔨 Each mission team invents their own integration
- 🎭 Architecture decisions not documented
- 📞 Communicated through private channels
- 🔄 Reinvented for each new mission
Risk: No standard integration pattern, constant rework.
Why This Matters: Every mission team is reinventing the wheel. One team builds F´+PROVES as separate processes communicating over sockets. Another compiles PROVES to C and links it with F´. A third uses Python embedding. Without a documented pattern, teams waste months on integration instead of working on their actual mission.
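To make the "separate processes" option concrete, here is one possible shape for the power-control side of such a link. It is a sketch under stated assumptions, not a documented F´ or PROVES interface: the JSON message format is invented for illustration, and `manager` stands for any object with `turn_on(name)` and `turn_off(name)` methods.

```python
import json


def handle_request(line, manager):
    """Decode one JSON request line, apply it, return a JSON response line.

    Hypothetical protocol: {"cmd": "turn_on" | "turn_off", "device": "<name>"}.
    """
    try:
        msg = json.loads(line)
        cmd, device = msg["cmd"], msg["device"]
    except (ValueError, KeyError, TypeError):
        return json.dumps({"ok": False, "error": "bad request"})
    actions = {"turn_on": manager.turn_on, "turn_off": manager.turn_off}
    if not isinstance(cmd, str) or cmd not in actions:
        return json.dumps({"ok": False, "error": "unknown command"})
    actions[cmd](device)
    return json.dumps({"ok": True, "cmd": cmd, "device": device})
```

Because the handler works on plain strings, the same function can sit behind a socket, a serial port, or a unit test, which leaves the transport decision open until a standard pattern is agreed on.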
Summary: Knowledge Gap Impact
Gap Distribution
%%{init: {'theme':'base', 'themeVariables': { 'pie1':'#ff6b6b', 'pie2':'#ffa500', 'pie3':'#ffd700', 'pie4':'#ff8c42', 'pie5':'#ffb347'}}}%%
pie title Knowledge Gaps by Category
"Timing Specifications" : 3
"Hardware Parameters" : 5
"Software Integration" : 4
"Error Handling" : 2
"Platform Specifics" : 3
Risk Matrix
| Gap | Probability of Occurrence | Severity if Unknown | Overall Risk |
|---|---|---|---|
| Power-On Timing | 70% | Critical | 🔴 EXTREME |
| Voltage Stability | 40% | Critical | 🔴 HIGH |
| Error Recovery | 90% | Medium | 🟡 HIGH |
| Bus Conflicts | 30% | High | 🟡 MEDIUM |
| Platform Integration | 60% | Medium | 🟡 MEDIUM |
Time to Discover
gantt
title Typical Discovery Timeline for Knowledge Gaps
dateFormat YYYY-MM-DD
section Design Phase
Integration planning :2024-01-01, 7d
section Development
Code implementation :2024-01-08, 14d
section Testing
Bench testing :2024-01-22, 7d
Discovery: Timing gap :milestone, 2024-01-26, 0d
section Integration
System integration :2024-01-29, 14d
Discovery: Bus conflict :milestone, 2024-02-05, 0d
section Flight Prep
Environmental testing :2024-02-12, 21d
Discovery: Voltage gap :milestone, 2024-02-28, 0d
section Operations
Launch and operations :2024-03-05, 7d
Discovery: Error handling gap :crit, milestone, 2024-03-08, 0d
Average Discovery Time: 45-60 days after project start
Cost of Late Discovery: Exponential
- Design phase: 1× cost to fix
- Development: 10× cost to fix
- Testing: 100× cost to fix
- Flight: Mission loss
Recommendations
Immediate Actions
- Create Integration Guide
  - Document all 5 knowledge gaps
  - Provide specifications for timing, voltage, errors
  - Include decision trees for error recovery
  - Specify platform integration patterns
- Extract from Tribal Knowledge
  - Interview experienced engineers
  - Document undocumented procedures
  - Capture failure lessons learned
  - Create searchable knowledge base
- Link Hardware to Software Docs
  - Cross-reference schematics
  - Include component datasheets
  - Document pin mappings
  - Specify electrical characteristics
Long-Term Solutions
- Automated Gap Detection
  - Scan documentation for missing specifications
  - Flag undefined timing requirements
  - Detect undocumented integrations
  - Alert on platform-specific gaps
- Empirical Capture System
  - Log all mission failures
  - Extract knowledge from debugging sessions
  - Capture workarounds and fixes
  - Build searchable failure database
- Continuous Knowledge Review
  - Regular documentation audits
  - Cross-team knowledge sharing sessions
  - Mandatory post-mission reports
  - Knowledge preservation before team turnover
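As a starting point for automated gap detection, a scanner can look for explicit placeholders in documentation text. The sketch below treats `???`, the marker this analysis uses for missing values, as the gap signal; the function name is an assumption, and a real tool would need richer rules than a single pattern.

```python
import re

# "???" is how this analysis marks a missing value.
PLACEHOLDER = re.compile(r"\?\?\?")


def find_placeholder_gaps(text):
    """Return (line_number, line) pairs for lines containing '???'."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if PLACEHOLDER.search(line)
    ]


sample = "t_rise = ??? ms\nV_NOM = 3.3V\nR_ON: ??? mΩ"
print(find_placeholder_gaps(sample))
# [(1, 't_rise = ??? ms'), (3, 'R_ON: ??? mΩ')]
```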
Analysis Method: Negative space analysis, gap identification
Gaps Found: 5 major categories, 17 specific missing items
Estimated Risk: 🔴 EXTREME (multiple critical gaps)
Date: December 20, 2024