Chair Falls Risk Analysis · March 2026

Exposure-Normalized Bed and Chair Fall Rates via Continuous AI Monitoring

Retrospective cohort · AI monitoring · 10 regional hospitals · Aug 2024 – Dec 2025

Gabriel · Rehani · Drumm · Troy · Wyatt · Singh · run: consensus_v3_outcome_20260324T014635Z

2.35×

Adjusted Chair-vs-Bed Rate Ratio

95% CI 0.87–6.33 · p=0.0907

17.8

Chair Falls per 1,000 Exposure-Hours

vs. 4.3 for bed-bound · 4.1× descriptive

40

Linked Fall Events in Analysis Base

of 43 adjudicated · 292,914 analysis rows

6/7

Direct Chair Falls: Footrest/Positioning

86% in mechanism observation cohort

Core Findings

Exposure-Normalized Fall Rates

Falls per 1,000 exposure-hours · probability-weighted

17.8

Chair-seated

4.3

Bed-bound

Hard-label rates: chair 15.6, bed 4.5 /1,000 h (shown for transparency). Total exposure: 320.5 chair-hours · 5,121.4 bed-hours.

Fall Mechanisms (Observation Cohort)

32 deduplicated events · not a population denominator

Other / unclassified

56%

Transfer failure

25%

Footrest/positioning

19%

6 of 7 direct-chair events (86%) classified as footrest/positioning failure.

Furniture-Origin Chain & Post-Departure Latency

Observation cohort · n=32 events · expands chair-associated risk beyond direct falls

Chain classification: 12 direct bed · 10 bed-origin room · 7 direct chair · 3 chair-origin room.

Including chair-origin room events expands the chair-associated count from 7 to 10. The reclassification sensitivity analysis yielded the same RR (2.35), indicating no material change to the modeled chair/bed allocation in this run.

Post-departure latency: median 18 s for bed-origin (IQR 6–69; n=10) and median 50 s for chair-origin events (n=3).

This short transition window represents a clinically actionable monitoring target.

Adjusted Modeling — Sensitivity Analysis Summary

Poisson GLM · log-link · offset=log(exposure_hours) · adjusted for time-of-day, day-of-week, quarter, site

Analysis	RR	95% CI	p-value	Events
Primary (all hours) ★	2.35	0.87–6.33	0.0907	40
Primary (HC3 robust SE)	2.35	0.93–5.94	0.0709	40
Primary (clustered by division) †	2.35	1.89–2.92	<0.0001	40
Eligibility threshold ≥12 h	2.39	0.88–6.47	0.0859	38
Eligibility threshold ≥24 h	2.43	0.89–6.63	0.0831	36
Chair-origin reclassified	2.35	0.87–6.33	0.0907	40
Position-certainty, time-window, weekday/weekend	Insufficient data — not estimable (pre-specified gate)

★ Primary result. † Exploratory — 9 intervention divisions, underdispersed (deviance/df=0.13). Primary + HC3 CIs cross 1.0. Misclassification scenarios: RR range 4.49–13.17 across estimable swaps.

Why This Matters

The Measurement Gap

Standard fall metrics (events per 1,000 bed-days) conflate all patient positions. AI monitoring enables posture-specific denominators — the first time this comparison is possible at scale.

Modifiable Target Identified

Among the 7 direct-chair mechanism-coded events, 6 carried a footrest/positioning tag. This supports targeted chair-setup confirmation as a hypothesis to test prospectively — not a proven intervention.

Important Caveats

Hypothesis-generating only. Single health system, observational design, position classifier macro F1=0.528, unmeasured confounders (acuity, staffing, mobility). Prospective validation required.

Data Quality Gates

All three pre-specified readiness gates cleared prior to inferential reporting.

Gate 1: Data Extraction Validated

Gate 2: Label QA (report-only thresholds)

Gate 3: De-identification Cleared

Gate 2 note: macro F1=0.528 · ECE=0.450 · detection F1=0.846 (precision 1.000 / recall 0.733) · latency MAE=37.9 s. Label quality remains a material limitation; inferential sharing requires stakeholder sign-off.

Recommended Next Steps

01 / NEAR-TERM

Chair-Setup QA Hypothesis

Frame footrest confirmation and call-light accessibility as a prospective QA pilot — not a policy change.

02 / VALIDATION

Multi-Site Prospective Study

Larger chair-exposure cohort, improved position labels, clinical confounders (acuity, mobility, staffing).

03 / PLATFORM

Classifier Improvement

Macro F1=0.528 still limits causal interpretation. Better labels are a prerequisite for confident multi-site claims.

04 / FRAMING

Nursing-Led Integration

Co-design any monitoring pathway with nursing staff. Technology-first framing is a known failure mode.

Data Dashboard · consensus_v3_outcome_20260324T014635Z

Chair Falls Risk — Analytic Summary

Synced to submitted arXiv bundle · run_id: consensus_v3_outcome_20260324T014635Z · March 2026

Gabriel · Rehani · Drumm · Troy · Wyatt · Singh · LookDeep Health

2.35×

Adjusted Chair-vs-Bed RR

95% CI 0.87–6.33 · p=0.0907

17.8

Chair Falls / 1,000 Exposure-Hrs

vs. 4.3 bed · 4.1× unadjusted

43 / 40

Study-Window Events

43 matched to pipeline · 40 inferential

6 / 7

Chair Falls: Footrest Tag

86% in mechanism obs. cohort

Exposure & Fall Rates

Fall Rates by Position

Falls per 1,000 exposure-hours · probability-weighted

Exposure Hours Distribution

Intervention-eligible units · 5,441.9 total hours

GT Pre-Fall Location

Adjudicated ground-truth labels · n=85 events with location

Adjusted Modeling & Mechanisms

Sensitivity Analysis Forest Plot

Poisson GLM · log-link · offset=log(exposure_hours)

Mechanism Taxonomy

Observation cohort · n=32 events

Furniture-Origin Chain

Obs. cohort n=32 · 7→10 chair-assoc. events incl. room

Label Quality & Validation

Label Evaluation Metrics

v3 adjudicated benchmark · 30 truth rows / 31 seqs

Post-Departure Latency

Time from furniture departure to fall · events with evaluable timestamps

LLM Benchmark (Ancillary)

3-way overlap subset · n=42 visible primary rows · ancillary only

Cohort Reference Tables

Cohort Summary (Table 1)

Parameter	Value
Study period	Aug 2024 – Dec 2025
Total monitors (hourly)	5,531
Eligible monitors	3,980 / 5,531 (72.0%)
Intervention-eligible	42
Control-eligible	3,938
Analysis base rows	292,914
Adj. events (study window)	43 matched · 40 linked
Chair exposure	320.51 hrs
Bed exposure	5,121.42 hrs
Broader feed (descriptive)	91 events (2022–2026)
Obs. cohort (mechanism)	32 dedup. events
Benchmark subset	30 truth rows / 31 seqs
Eligibility gates	min_hrs=4 · min_cov=0.95

Unadjusted Rates & Sensitivity (Tables 2–3)

Analysis	RR	95% CI	p	n
Primary ★	2.35	0.87–6.33	0.091	40
HC3 robust	2.35	0.93–5.94	0.071	40
Clustered †	2.35	1.89–2.92	<.001	40
Chair-origin reclassified	2.35	0.87–6.33	0.091	40
Elig. ≥12h	2.39	0.88–6.47	0.086	38
Elig. ≥24h	2.43	0.89–6.63	0.083	36
Elig. ≥48h	2.42	0.84–6.98	0.101	30
Pos.-certainty / time-window / weekday/wkend: NE

★ Primary. † Exploratory, 9 divisions. Misclassification scenarios (estimable): RR 4.49–13.17.

Unadjusted rates: Chair 5 HL events / 320.51 hrs = 15.6/1k (prob-weighted 17.8). Bed 23 HL events / 5,121.42 hrs = 4.5/1k (prob-weighted 4.3).

arXiv-aligned dashboard · run_id: consensus_v3_outcome_20260324T014635Z · March 2026 · P. Gabriel, P. Rehani, Z. Drumm, T. Troy, T. Wyatt, N. Singh

Original Research · arXiv:2603.22785 [cs.CV] · 24 Mar 2026

Exposure-Normalized Bed and Chair Fall Rates via Continuous AI Monitoring

Paolo Gabriel · Peter Rehani · Zack Drumm · Tyler Troy · Tiffany Wyatt · Narinder Singh

LookDeep Health

March 2026 · Synced to submitted arXiv bundle (upload 2026-03-23)

Patient Positioning · Accidental Falls · Video Recording · Hospital Units · Poisson Regression · Retrospective Cohort

2.35

95% CI 0.87–6.33 · p=0.0907

Adjusted RR (chair/bed)

17.8

per 1,000 chair-hrs

Chair fall rate

4.3

per 1,000 bed-hrs

Bed fall rate

3,980

eligible monitors

Denominator cohort

40

of 43 matched

Inferential events

6/7

direct chair falls

Footrest tag (obs. cohort)

Abstract

This retrospective cohort study used continuous AI monitoring to estimate fall rates by exposure time rather than occupied bed-days. From August 2024 to December 2025, 3,980 eligible monitoring units contributed 292,914 hourly rows, yielding probability-weighted rates of 17.8 falls per 1,000 chair exposure-hours and 4.3 per 1,000 bed exposure-hours. Within the study window, 43 adjudicated falls matched the monitoring pipeline, and 40 linked to eligible exposure hours for the primary Poisson model, producing an adjusted chair-versus-bed rate ratio of 2.35 (95% confidence interval 0.87 to 6.33; p=0.0907). In a separate broader observation cohort (n=32 deduplicated events), 6 of 7 direct chair falls involved footrest-positioning failures. Because this was an observational study in a single health system, these findings remain hypothesis-generating and support testing safer chair setups rather than using chairs less.

Key Messages

Chair-seated patients had a probability-weighted fall rate of 17.8 per 1,000 exposure-hours vs. 4.3 for bed-bound patients in this dataset.
The adjusted chair-vs-bed RR was 2.35 (95% CI 0.87–6.33), elevated but imprecise; the interval crosses 1.0.
Six of 7 mechanism-coded direct-chair falls carried a footrest/positioning tag, pointing to a plausible modifiable prevention target.
Findings are hypothesis-generating. Single health system, observational design, unmeasured confounders, and classifier limitations (macro F1=0.528) preclude causal conclusions.

Introduction

Inpatient falls represent one of the most prevalent and costly preventable adverse events in acute care. Contemporary U.S. and international reports continue to show substantial event burden and injury risk across inpatient settings. Falls prevention remains a national accreditation and patient-safety priority, with The Joint Commission and national safety organizations continuing to emphasize this domain as a persistent quality challenge. Global guidance similarly identifies hospital falls prevention as an ongoing systems-level priority.

A central measurement gap in the inpatient falls literature is denominator choice. Most hospital fall metrics are reported per 1,000 occupied bed-days — a denominator that merges bed time, chair time, transfers, and room movement into a single exposure bucket. When a patient falls from a chair, the event is still counted against a bed-day denominator, which can mask position-specific hazard and dilute clinically actionable signal about chair setup and supervision.

Continuous AI-based patient monitoring systems create an opportunity to replace that blended denominator with position-specific exposure time. By assigning probabilistic chair and bed positions at sub-minute resolution, these systems can estimate how many falls occur per hour of chair-seated time versus per hour of bed-bound time within the same monitored population. This denominator-first approach aligns with standard rate modeling and supports adjusted comparisons that routine incident-reporting systems cannot produce.

This retrospective cohort analysis uses continuous AI monitoring data to estimate position-specific exposure-normalized fall rates and then test whether the observed descriptive separation (17.8 vs 4.3 falls per 1,000 exposure-hours) persists after adjustment in a Poisson rate model. The aim is not to argue for less chair use or to promote bed confinement, given the recognized harms of low mobility in hospitalized older adults. Rather, the aim is to determine whether chair-positioned time concentrates fall opportunity enough to justify safer chair setup, clearer transfer workflows, and prospective denominator-aware monitoring studies that integrate mobility promotion with targeted fall prevention.

Methods

Study Design and Setting

Retrospective observational cohort study using continuous AI monitoring data from a single regional health system, analyzed at the division level. Study period: August 2024–December 2025. Conducted under an existing data use agreement between LookDeep Health and the health system as a secondary analysis of de-identified monitoring records generated during routine clinical operations.

Data Source

The AI monitoring platform continuously processes room-level video feeds and assigns probabilistic position estimates for each monitored patient. Per-hour position fractions (pct_chair, pct_bed, pct_ambulatory) were extracted from the hourly monitoring cache using a validated study-window filter and percentage-sum constraint. The extraction yielded 356,391 hourly rows with a valid position-percentage sum of 100% across all rows. Fall events were identified from AI-detected alarm records and linked to the hourly exposure structure via monitor and date-hour keys.
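The linkage step can be sketched as a join on a (monitor, date-hour) key. This is a minimal illustration, not the study pipeline: the field names `monitor_id`, `hour_start`, and `alarm_time`, the helper names, and the sample rows are all hypothetical; only `pct_chair`, `pct_bed`, and `pct_ambulatory` come from the text above.

```python
from datetime import datetime

def hour_key(monitor_id, ts):
    """Build the (monitor, date-hour) join key by truncating a timestamp to the hour."""
    return (monitor_id, ts.replace(minute=0, second=0, microsecond=0))

def link_events_to_hours(events, hourly_rows):
    """Split events into those that match an hourly exposure row and those that do not."""
    index = {hour_key(r["monitor_id"], r["hour_start"]): r for r in hourly_rows}
    linked, unlinked = [], []
    for ev in events:
        row = index.get(hour_key(ev["monitor_id"], ev["alarm_time"]))
        if row is None:
            unlinked.append(ev)
        else:
            linked.append({**ev, "hour_row": row})
    return linked, unlinked

# Hypothetical rows for illustration only.
hourly_rows = [
    {"monitor_id": "m1", "hour_start": datetime(2025, 1, 5, 14),
     "pct_chair": 40.0, "pct_bed": 60.0, "pct_ambulatory": 0.0},
]
events = [
    {"monitor_id": "m1", "alarm_time": datetime(2025, 1, 5, 14, 37, 12)},
    {"monitor_id": "m2", "alarm_time": datetime(2025, 1, 5, 9, 2, 0)},
]
linked, unlinked = link_events_to_hours(events, hourly_rows)
```

Events with no matching hourly row stay in `unlinked`, which mirrors the distinction drawn in this paper between events matched to the pipeline and events linked to eligible exposure hours.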

Cohort Definition and Eligibility

A monitoring unit was defined as a unique monitor for cohort membership. Units were classified as intervention-type if the monitor appeared in the extracted fall-event source during the run, and as control-type if it appeared in the hourly exposure source without a linked extracted fall event. Eligibility gates: at least 4 observed monitor-hours (min_observed_hours=4) and a position-coverage ratio of at least 0.95 (min_coverage_ratio=0.95). Of 5,531 units in the hourly data, 3,980 passed both eligibility criteria (intervention: 42 eligible; control: 3,938 eligible). High fall-risk context was informed by standard clinical screening including the Hendrich II model.
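The two eligibility gates can be expressed as a simple filter. The sketch below assumes that the coverage ratio is the share of observed hours with a valid position estimate; the source does not define it, so that definition, the field names, and the sample monitors are hypothetical.

```python
def eligible_monitors(monitors, min_observed_hours=4, min_coverage_ratio=0.95):
    """Return the ids of monitors that pass both eligibility gates."""
    kept = []
    for m in monitors:
        if m["observed_hours"] < min_observed_hours:
            continue  # fails the observed-hours gate
        # Assumed definition: covered hours divided by observed hours.
        coverage = m["covered_hours"] / m["observed_hours"]
        if coverage >= min_coverage_ratio:
            kept.append(m["monitor_id"])
    return kept

# Hypothetical monitors for illustration only.
monitors = [
    {"monitor_id": "m1", "observed_hours": 10, "covered_hours": 10},  # passes both gates
    {"monitor_id": "m2", "observed_hours": 3, "covered_hours": 3},    # too few hours
    {"monitor_id": "m3", "observed_hours": 20, "covered_hours": 18},  # coverage 0.90
]
kept = eligible_monitors(monitors)
```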

Exposure Measurement

Exposure expressed in fractional person-hours by position. Chair exposure per eligible hourly row = pct_chair / 100 hours; bed exposure = pct_bed / 100 hours. Total exposure across all intervention-eligible unit-hours: 320.51 chair-hours and 5,121.42 bed-hours. Analysis base: 292,914 rows after eligibility filtering.
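As a small worked illustration of the exposure definition, each hourly row contributes `pct_chair / 100` chair-hours and `pct_bed / 100` bed-hours. The function name and the three sample rows below are hypothetical.

```python
def exposure_hours(rows):
    """Sum fractional chair and bed exposure (in hours) over hourly rows."""
    chair = sum(r["pct_chair"] / 100.0 for r in rows)
    bed = sum(r["pct_bed"] / 100.0 for r in rows)
    return chair, bed

# Three hypothetical hourly rows; each sums to 100% across positions.
rows = [
    {"pct_chair": 25.0, "pct_bed": 75.0, "pct_ambulatory": 0.0},
    {"pct_chair": 0.0, "pct_bed": 100.0, "pct_ambulatory": 0.0},
    {"pct_chair": 50.0, "pct_bed": 30.0, "pct_ambulatory": 20.0},
]
chair_hours, bed_hours = exposure_hours(rows)  # 0.75 chair-hours, 2.05 bed-hours
```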

Fall Ascertainment

Within the study window (August 2024–December 2025), 43 fall events were identified from the adjudicated consensus record; 40 linked to eligible analysis-base hours and entered the primary exposure-normalized analysis. A broader monitoring feed (2022–2026, n=91 deduplicated events) was retained for descriptive context only. Hard labels at alarm time: chair (n=5), bed (n=23), room/no_patient (n=12). Adjusted modeling used probabilistic rather than hard-label event allocation.
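The difference between hard-label and probabilistic allocation can be illustrated with fractional counts. In the sketch below, the per-event probabilities `p_chair` and `p_bed`, the most-probable-position rule for hard labels, and the four sample events are assumptions for illustration; the source does not specify how the pipeline derives either quantity.

```python
def expected_counts(events):
    """Probability-weighted counts: each event adds its chair/bed probability."""
    chair = sum(e["p_chair"] for e in events)
    bed = sum(e["p_bed"] for e in events)
    return chair, bed

def hard_label_counts(events):
    """Hard-label counts: each event goes wholly to its most probable position (assumed rule)."""
    counts = {"chair": 0, "bed": 0, "room": 0}
    for e in events:
        probs = {
            "chair": e["p_chair"],
            "bed": e["p_bed"],
            "room": 1.0 - e["p_chair"] - e["p_bed"],
        }
        counts[max(probs, key=probs.get)] += 1
    return counts

# Hypothetical events for illustration only.
events = [
    {"p_chair": 0.90, "p_bed": 0.10},
    {"p_chair": 0.20, "p_bed": 0.80},
    {"p_chair": 0.10, "p_bed": 0.10},  # mostly room / no patient
    {"p_chair": 0.45, "p_bed": 0.55},
]
exp_chair, exp_bed = expected_counts(events)
hard = hard_label_counts(events)
```

Under this reading, expected counts need not be whole numbers, which is consistent with the fractional expected-fall totals reported in the Results.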

Auxiliary Annotation Cohorts

Four non-interchangeable cohorts support different descriptive tasks: (1) the study-window adjudicated cohort (n=43); (2) the 40-event inferential base; (3) the broader observation cohort (32 deduplicated events, 37 source rows) for mechanism coding; and (4) the departure-aware subset (30 truth rows, 31 scored sequences) for exit-concordance diagnostics. Only the 40-event inferential base contributes to the primary RR estimate.

Statistical Analysis

Unadjusted rates calculated as events per 1,000 exposure-hours. Adjusted analyses used a Poisson GLM with log-link and offset log(exposure_hours). Covariates: patient position (chair vs. bed; reference = bed), seven local-time windows (00:00–05:59 through 21:00–23:59), day of week, calendar quarter, division identifier. Adjusted RR = exp(β_chair); 95% CIs via Wald intervals on the log-RR scale. Overdispersion assessed via deviance/df and Pearson χ²/df. Pre-specified negative-binomial fallback at threshold >1.5 (not triggered; deviance/df = 0.13). Robustness: HC3 heteroskedasticity-consistent SEs and division-level cluster-robust covariance. All analyses in Python (statsmodels).
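Written out, the model described above has the following general form. The notation is introduced here for illustration: $Y_i$ is the fall count for analysis row $i$, $\text{chair}_i$ is the position indicator (reference = bed), and $\mathbf{x}_i$ collects the time-window, day-of-week, quarter, and division indicators.

```latex
\log \mathbb{E}[Y_i]
  = \log(\text{exposure\_hours}_i)
  + \beta_0
  + \beta_{\text{chair}}\,\text{chair}_i
  + \mathbf{x}_i^{\top}\boldsymbol{\gamma}

\mathrm{RR} = \exp\!\big(\hat{\beta}_{\text{chair}}\big),
\qquad
95\%\ \mathrm{CI} = \exp\!\big(\hat{\beta}_{\text{chair}} \pm 1.96\,\mathrm{SE}(\hat{\beta}_{\text{chair}})\big)
```

Because the offset enters with a fixed coefficient of 1, the remaining terms model the log rate per exposure-hour rather than the raw count.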

Sensitivity Analyses

Pre-specified sensitivity analyses: seven local-time window restrictions; weekday-only and weekend-only restrictions; position-certainty hours only (chair or bed fraction ≥0.50); five misclassification scenarios (symmetric 10%/20%/30% chair–bed swaps; one-sided 20% swaps); eligibility thresholds at 4, 12, 24, and 48 hours; furniture-origin reclassification of chair-origin room falls.

Furniture-Exit Detection Concordance

Two AI departure signals were evaluated against consensus-annotated furniture_departure_time using a ±5-minute search window: (1) nudge boundary crossing (first green→yellow/red transition in the operational nudge_state indicator); (2) location-label transition (first change in dominant_location_label away from the patient's known last-furniture position). Concordance quantified as bias (AI minus consensus), mean absolute error (MAE), and percentile-based distributions.
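The concordance summary can be sketched as follows. This is an illustration under stated assumptions rather than the study code: timestamps are taken to be seconds on a shared clock, percentiles are computed on absolute error with a nearest-rank rule, and the function names and sample pairs are hypothetical.

```python
import math

def nearest_rank_percentile(sorted_vals, q):
    """Nearest-rank percentile (q in 0..100) of an already sorted, non-empty list."""
    rank = max(1, math.ceil(q / 100.0 * len(sorted_vals)))
    return sorted_vals[rank - 1]

def concordance(pairs, window_s=300.0):
    """Summarize AI-vs-consensus departure timing inside a +/- window_s search window.

    pairs: (ai_time_s, consensus_time_s) tuples; ai_time_s is None if the signal never fired.
    """
    errors = [ai - ref for ai, ref in pairs
              if ai is not None and abs(ai - ref) <= window_s]
    if not errors:
        return None  # nothing matched inside the search window
    abs_errors = sorted(abs(e) for e in errors)
    return {
        "n_matched": len(errors),
        "bias_s": sum(errors) / len(errors),        # AI minus consensus
        "mae_s": sum(abs_errors) / len(abs_errors),
        "p50_abs_s": nearest_rank_percentile(abs_errors, 50),
        "p90_abs_s": nearest_rank_percentile(abs_errors, 90),
    }

# Hypothetical pairs: one AI signal falls outside the window, one never fired.
pairs = [(105.0, 100.0), (190.0, 200.0), (1000.0, 400.0), (None, 50.0), (330.0, 300.0)]
summary = concordance(pairs)
```

Returning `None` when no pair matches keeps a non-estimable run distinct from a run with zero error.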

Label Evaluation

Automated label quality evaluated against the departure-aware benchmark subset (30 truth rows, 31 scored sequences). Metrics: macro F1, ECE (10-bin), detection F1, detection precision, detection recall, latency MAE.
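For readers unfamiliar with these metrics, the sketch below shows one common way to compute macro F1 and a 10-bin expected calibration error (ECE). It assumes equal-width confidence bins and uses the top-class probability as the confidence; the study's exact binning and confidence definition are not stated in the source, and the sample labels are hypothetical.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for lab in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == lab and p == lab)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != lab and p == lab)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == lab and p != lab)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted mean gap between accuracy and mean confidence across equal-width bins."""
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # confidence 1.0 goes in the last bin
        bins[idx].append((conf, ok))
    ece = 0.0
    for b in bins:
        if b:
            avg_conf = sum(c for c, _ in b) / len(b)
            accuracy = sum(1 for _, ok in b if ok) / len(b)
            ece += len(b) / n * abs(accuracy - avg_conf)
    return ece

# Hypothetical labels and confidences for illustration only.
y_true = ["chair", "bed", "bed", "room"]
y_pred = ["chair", "bed", "room", "room"]
confidences = [0.95, 0.85, 0.65, 0.75]
correct = [t == p for t, p in zip(y_true, y_pred)]

f1 = macro_f1(y_true, y_pred)
ece = expected_calibration_error(confidences, correct)
```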

Ancillary Multimodal LLM Benchmark

Three multimodal LLMs (Gemini 2.5 Flash, Gemini 2.5 Pro, Gemini 3.1 Pro Preview) independently processed de-identified adjudicated event clips (81 clips; v2 consensus package). Primary comparison: 3-way overlap subset (n=42 visible primary rows). Metrics: fall sensitivity, non-fall specificity, pre-fall location accuracy, fall-time MAE. Reported as ancillary validation only.

Ethics and Data Governance

Data collected from patients admitted to one of eleven hospital partners across three U.S. states. The study followed CHAI standards under a Business Associate Agreement. Patients provided written informed consent for monitoring as part of standard inpatient care (HIPAA-compliant). All video data blurred prior to storage; no identifiable information included. Face-blurred frames used only for training purposes. The outcomes of this analysis did not influence patient care or clinical outcomes.

Figure 1. STROBE cohort flow for the denominator cohort, inferential event base, and descriptive cohorts.

Results

Using position-specific exposure hours rather than occupied bed-days as the denominator, chair time showed a higher descriptive fall rate than bed time (17.8 vs. 4.3 falls per 1,000 exposure-hours). Within the study window, 43 adjudicated events matched the monitoring pipeline and 40 linked to eligible analysis-base hours for adjusted modeling, yielding a primary chair-vs-bed RR of 2.35 (95% CI 0.87–6.33; p=0.0907). In a separate broader observation cohort (n=32), 6 of 7 direct chair falls carried a footrest/positioning tag.

Table 1. Cohort summary statistics for the chair-falls retrospective cohort analysis.
Parameter	Value
Study period	August 2024 – December 2025
Total monitors (hourly data)	5,531
Total monitors (cohort map)	5,570
Eligible monitors (both gates)	3,980 / 5,531 (72.0%)
Intervention-eligible	42
Control-eligible	3,938
Total valid hourly exposure rows	356,391
Analysis base rows (after eligibility)	292,914
Study-window adjudicated events matched to pipeline	43
Inferential fall events linked to analysis base	40
Broader monitoring feed (2022–2026, descriptive only)	91 deduplicated events
Broader observation cohort (mechanism coding)	32 deduplicated events
Departure-aware benchmark subset	30 truth rows / 31 scored sequences
Chair exposure (hours)	320.51
Bed exposure (hours)	5,121.42
Eligibility gates	`min_observed_hours=4`; `min_coverage_ratio=0.95`
Primary model	Poisson GLM; offset=log(exposure_hours)

Descriptive Fall Rates by Position (Unadjusted)

Among intervention-eligible units, the probability-weighted descriptive chair fall rate was 17.8 per 1,000 exposure-hours (5.69 expected falls / 320.51 chair-hours) and the bed rate was 4.3 per 1,000 exposure-hours (22.05 expected falls / 5,121.42 bed-hours).

Table 2. Unadjusted fall rates by patient position, intervention-eligible units.
Position	Hard-label events	Exposure (hrs)	Hard-label rate /1,000 h	Expected rate /1,000 h
Chair	5	320.51	15.6	17.8
Bed	23	5,121.42	4.5	4.3

Expected rates are probability-weighted; hard-label counts reflect AI-assigned position at event time. Analysis restricted to intervention-eligible units (n=42).
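The rates in Table 2 follow directly from the reported counts and exposure hours. The short check below reproduces that arithmetic from the figures given above; the function name is illustrative.

```python
def rate_per_1000(events, exposure_hours):
    """Events per 1,000 exposure-hours."""
    return 1000.0 * events / exposure_hours

# Probability-weighted (expected) falls and exposure hours as reported above.
chair_rate = rate_per_1000(5.69, 320.51)    # rounds to 17.8
bed_rate = rate_per_1000(22.05, 5121.42)    # rounds to 4.3
unadjusted_rr = chair_rate / bed_rate       # rounds to 4.12

# Hard-label counts from Table 2.
chair_hard = rate_per_1000(5, 320.51)       # rounds to 15.6
bed_hard = rate_per_1000(23, 5121.42)       # rounds to 4.5
```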

Figure 2. Unadjusted fall rates per 1,000 exposure-hours by patient position among intervention-eligible monitoring units. Probability-weighted rates. Unadjusted RR = 4.12.

Primary Adjusted Relative Risk

The adjusted Poisson GLM was fitted to the 40-event inferential base. The primary adjusted chair-vs-bed RR was 2.35 (95% CI 0.87–6.33; p=0.0907). HC3-robust: RR 2.35 (95% CI 0.93–5.94; p=0.0709). Division-clustered (exploratory; 9 intervention divisions only): RR 2.35 (95% CI 1.89–2.92; p<0.0001). The model showed underdispersion (deviance/df=0.13; Pearson χ²/df=0.49), so the pre-specified negative-binomial fallback was not triggered. Position-certainty restriction: non-estimable.

Table 3. Adjusted rate ratios and sensitivity analyses.
Analysis	RR	95% CI	p-value	Events	Notes
Primary (all hours)	2.35	0.87–6.33	0.0907	40	estimable
Primary (HC3)	2.35	0.93–5.94	0.0709	40	robust SE
Primary (clustered, division)	2.35	1.89–2.92	<0.0001	40	exploratory; 9 divisions
Position-certainty only	NE	—	—	38	insufficient_data
Chair-origin reclassified	2.35	0.87–6.33	0.0907	40	furniture-origin sensitivity
Eligibility threshold ≥12 h	2.39	0.88–6.47	0.0859	38	estimable
Eligibility threshold ≥24 h	2.43	0.89–6.63	0.0831	36	estimable
Eligibility threshold ≥48 h	2.42	0.84–6.98	0.1008	30	estimable
All time-window, weekday/weekend	Insufficient data — not estimable (pre-specified gate)

NE = non-estimable. Misclassification scenarios (estimable): RR 4.49–13.17 across symmetric 10/20/30% and one-sided swaps. Effect magnitude highly sensitive to label error assumptions. The clustered row is exploratory.
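Because the intervals are Wald intervals on the log-RR scale, the standard error and p-value can be approximately recovered from any reported row. The sketch below does this for the primary estimate using only the rounded values in Table 3, so small discrepancies from the reported p-value are expected; the function name is illustrative.

```python
import math

def wald_summary(rr, ci_low, ci_high, z_crit=1.959964):
    """Back out the log-scale SE, z statistic, and two-sided p from an RR and its 95% Wald CI."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = math.log(rr) / se
    # Two-sided normal p-value: 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p

se, z, p = wald_summary(2.35, 0.87, 6.33)  # primary row of Table 3
```

The recovered p-value lands close to the reported 0.0907, and the point estimate sits near the geometric midpoint of the interval, as expected for an interval that is symmetric on the log scale.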

Figure 3. Forest plot of adjusted rate ratios across primary and sensitivity analyses. Diamond = primary estimate. Dashed line = null (RR=1.0). Log scale.

Mechanism Taxonomy (Observation Cohort)

Among 32 deduplicated fall events in the observation cohort: 7 direct chair falls, 12 direct bed falls, 13 room falls. Of the 7 direct chair falls, 6 (86%) carried a footrest/positioning tag. Transfer failure was the dominant mechanism for direct bed falls (6/12, 50%).

Table 4. Fall mechanism taxonomy by pre-fall location (observation cohort, n=32 events).
Mechanism	Chair (n=7)	Bed (n=12)	Room (n=13)	Total
Footrest / positioning failure	6 (86%)	—	—	6
Transfer failure	—	6 (50%)	2 (15%)	8
Other / unclassified	1 (14%)	6 (50%)	11 (85%)	18

Convenience sample for mechanism characterization only; not a population prevalence estimate.

Figure 4. Fall mechanism taxonomy stratified by pre-fall patient position (observation cohort, n=32 events). 6 of 7 direct chair falls (86%) carried a footrest/positioning tag.

Furniture-Origin Chain Analysis

Including the 3 chair-origin room events with the 7 direct chair falls yields 10 chair-associated events. Post-departure latency: bed-origin events median 18 s (IQR 6–69; max 136; n=10); chair-origin events median 50 s (max 150; n=3).

Table 5. Event classification by furniture-origin chain (observation cohort, n=32).
Chain Category	Events	% of Total
Direct bed	12	37.5%
Bed-origin room	10	31.3%
Direct chair	7	21.9%
Chair-origin room	3	9.4%

Label Evaluation Metrics

Assessed against the v3 adjudicated benchmark subset (30 truth rows, 31 scored sequences): macro F1 = 0.528; ECE (10-bin) = 0.450; detection F1 = 0.846; detection precision = 1.000; detection recall = 0.733; latency MAE = 37.9 s (p50 = 22 s; p90 = 83 s). Detection precision was perfect on this subset (no false-positive fall alarms), but roughly 1 in 4 falls was missed at the detection level.
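The detection F1 follows from the reported precision and recall as their harmonic mean. The check below uses only the figures above; the function name is illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)

detection_f1 = f1_score(1.000, 0.733)  # rounds to 0.846
missed_fraction = 1.0 - 0.733          # about 0.27, i.e. roughly 1 in 4 falls
```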

Ancillary Multimodal LLM Benchmark

Table 6. Multimodal LLM fall-detection performance (3-way overlap subset, n=42 visible primary rows).
Model	Fall Sensitivity	Non-fall Specificity	Location Accuracy	Fall-time MAE (s)
Gemini 2.5 Flash	0.842	0.750	0.531	1,199
Gemini 2.5 Pro	0.684	1.000	0.654	521
Gemini 3.1 Pro Preview	0.447	1.000	0.588	106

Ancillary validation only. Different denominator structure from primary livestream pipeline; direct ranking not appropriate.

Discussion

Magnitude of Chair-Seated Risk

The main contribution of this analysis is denominator refinement rather than a definitive effect estimate. Once chair and bed time were expressed as separate exposure-hours, the descriptive rate contrast was substantial (17.8 vs 4.3 per 1,000 exposure-hours), and the adjusted RR remained above 1.0 at 2.35. However, the primary confidence interval crossed 1.0 (95% CI 0.87–6.33; p=0.0907), so the result should be interpreted as an elevated but uncertain signal rather than a confirmed multi-fold effect. The inferential event base remains modest (43 adjudicated study-window events; 40 eligible linked events).

Mechanism Insights

In the broader observation cohort, 6 of 7 direct chair falls carried a footrest/positioning tag. This concentration points to a consistent upstream problem: incomplete chair setup or unsupported lower-extremity position. Because this 32-event cohort is descriptive, broader than the study window, and not denominator-linked, the finding should be treated as signal generation for prospective testing rather than a population prevalence estimate.

Latent Bias Assessment

Position classification ambiguity. Scenario-based misclassification analyses remain the main quantitative guardrail. Across estimable scenarios, effect size changed materially (RR range 4.49–13.17), indicating sensitivity to label error assumptions even when direction is preserved.

Temporal confounding. Falls concentrate in waking and care-transition hours, and mean chair probability is highest in those same windows. The primary model adjusts for seven local-time windows, but sparse chair-event counts limit how well time-of-day clustering can be controlled. Residual temporal confounding would most likely inflate the observed chair-vs-bed RR.

Selection into position. No patient-level acuity, mobility, or staffing variables are available. Confounding by indication remains the most important unmeasured threat.

Bias audit summary. The primary interval (0.87–6.33) already allows for a substantially smaller effect than the point estimate suggests. The true causal effect could be materially attenuated or null once temporal and acuity confounders are better measured.

Post-Chair Risk Window and Furniture-Origin Analysis

Three room-classified events were preceded by chair departure, expanding chair-associated events from 7 to 10. Post-departure latency findings identify a short interval between leaving furniture and falling, supporting a clinically relevant transition window for future prospective monitoring work.

Ancillary LLM Benchmark

The Gemini models exhibited a sensitivity-specificity tradeoff consistent with the primary pipeline's label-quality findings: higher recall came at the cost of lower specificity and worse timing. No model achieved both high sensitivity and high location accuracy. The findings support the conclusion that prospective improvement in automated position classification is needed before label-derived metrics can support confirmatory inference.

Limitations

(1) Observational design; no causal inference supported. (2) Label quality is a material constraint (macro F1=0.528; ECE=0.450; detection recall=0.733). (3) Single regional health system limits generalizability. (4) Injury outcomes not captured. (5) Observation mechanism cohort is a non-representative convenience sample. (6) Furniture-exit concordance non-estimable in this run due to instrumentation gap. (7) Small event count (40 modeled events, of which only 5 were hard-labeled chair events) produces a wide CI.

Clinical Implications

Chair positioning remains clinically important for mobilization, delirium prevention, and avoidance of the functional harms of prolonged bed rest. The implication of these findings is safer chair use, not less chair use. The mechanism pattern supports targeted chair-setup confirmation before leaving chair-positioned patients unattended (footrest position, leg support, call-light accessibility) — framed as a QA hypothesis rather than a proven intervention. Stable adjusted estimates require prospective data with larger chair exposure, improved label calibration, and clinical confounders (acuity, mobility, staffing).

Conclusions

Position-specific exposure denominators identified higher descriptive fall rates during chair time than bed time (17.8 vs 4.3 probability-weighted falls per 1,000 exposure-hours), and adjusted modeling estimated a chair-vs-bed RR of 2.35 (95% CI 0.87–6.33). In a separate broader observation cohort, 6 of 7 direct chair falls involved footrest-positioning failures, pointing to a plausible modifiable prevention target. These findings remain hypothesis-generating and support safer chair use workflows, but require prospective validation before causal or policy-level conclusions.

Appendix

Table A1. Fall mechanism taxonomy by pre-fall location (observation cohort, n=32 events).
Mechanism	Chair (n=7)	Bed (n=12)	Room (n=13)	Total
Footrest / positioning failure	6 (86%)	—	—	6
Transfer failure	—	6 (50%)	2 (15%)	8
Other / unclassified	1 (14%)	6 (50%)	11 (85%)	18

Table A2. Event classification by furniture-origin chain (observation cohort, n=32).
Chain Category	Events	% of Total
direct_bed	12	37.5%
bed_origin_room	10	31.3%
direct_chair	7	21.9%
chair_origin_room	3	9.4%

Table A3. Multimodal LLM fall-detection performance (3-way overlap subset, n=42 visible primary rows).
Model	Fall Sensitivity	Non-fall Specificity	Location Accuracy	Fall-time MAE (s)
Gemini 2.5 Flash	0.842	0.750	0.531	1,199
Gemini 2.5 Pro	0.684	1.000	0.654	521
Gemini 3.1 Pro Preview	0.447	1.000	0.588	106

Clinical & Operational Guide

How to Interpret and Act on These Findings

A practical reference for clinicians, quality teams, and nursing leaders — covering methodology, interpretation, limitations, and evidence-calibrated next steps.

SECTION 01

Understanding the Analysis

Traditional hospital fall metrics count falls per 1,000 occupied bed-days. The problem is that a bed-day pools all the hours a patient spends in bed, in a chair, walking, and during transfers. When a patient falls from a chair, that fall is still counted against a bed-day denominator, which can understate how risky chair-seated time is per hour.

This analysis uses AI monitoring data to estimate how many hours each patient spent in a chair vs. in bed. Falls are then divided by the chair-hours or bed-hours observed. The result is a rate in falls per 1,000 exposure-hours, an apples-to-apples comparison across positions. Think of it like car vs. motorcycle safety: you can't compare accidents per day; you need accidents per mile driven.

The LookDeep platform continuously processes room-level video and assigns probabilistic position estimates at sub-minute resolution. Each hour is summarized as three fractions: pct_chair, pct_bed, and pct_ambulatory — always summing to 100%.

For this analysis, 356,391 hourly rows were extracted spanning August 2024–December 2025, all with valid 100% position sums. Falls were linked to these hourly exposure records via monitor and date-hour keys.

The adjusted analysis fits a Poisson GLM with a log link and a log(exposure_hours) offset, the standard epidemiological approach for rate-ratio estimation when outcomes are counts. The model adjusted for seven local-time windows, day-of-week, calendar quarter, and hospital site. Robustness was checked with HC3 heteroskedasticity-consistent SEs and division-level cluster-robust SEs.
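In symbols, a model of this form can be written as below. The notation is assumed for illustration and is not taken from the analysis code: Y_i is the fall count in row i, chair_i is the position indicator, and x_i collects the adjustment covariates (time window, day-of-week, quarter, site).

```latex
\log \mathbb{E}[Y_i]
  = \log(\text{exposure\_hours}_i)
  + \beta_0 + \beta_1\,\text{chair}_i
  + \boldsymbol{\gamma}^{\top}\mathbf{x}_i,
\qquad
\mathrm{RR} = e^{\beta_1}
```

Because the offset enters with a fixed coefficient of 1, the linear predictor models the log rate per exposure-hour, and exponentiating the position coefficient gives the adjusted chair-vs-bed rate ratio.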

What was not controlled for (data unavailable): patient acuity, nursing staffing ratios, patient mobility classification. These are the key residual confounders motivating prospective validation.

Fall mechanisms were assessed in a separate convenience cohort of 32 deduplicated fall events, coded via structured dual review. Mechanism groups: other (18), transfer failure (8), footrest/positioning (6). Within the 7 direct-chair events, 6 were tagged as footrest/positioning failures.

Important: This is a QA/internal convenience sample, not a probability sample of all falls. Mechanism proportions should be treated as hypothesis-generating, not as population prevalence estimates.
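The two footrest/positioning percentages quoted in this report use different denominators, which is easy to miss. A short sketch of the arithmetic, using only the counts given above (the variable names are illustrative):

```python
# Mechanism counts from the 32-event observation cohort.
mechanism_counts = {
    "other": 18,
    "transfer failure": 8,
    "footrest/positioning": 6,
}

total = sum(mechanism_counts.values())  # all events in the cohort
shares = {name: 100.0 * n / total for name, n in mechanism_counts.items()}

# Separate denominator: only the 7 direct-chair events.
direct_chair_share = 100.0 * 6 / 7

for name, pct in shares.items():
    print(f"{name}: {pct:.0f}% of {total} events")
print(f"footrest/positioning: {direct_chair_share:.0f}% of 7 direct-chair events")
```

The 19% figure is a share of all 32 cohort events, while the 86% figure is a share of the 7 direct-chair events only.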

🔍

SECTION 02

Interpreting the Results

The Rate Ratio (RR) of 2.35 means that, after adjusting for time-of-day, day-of-week, quarter, and site, chair-seated patients experienced an estimated 2.35× the per-hour fall rate of bed-bound patients. The CI (0.87–6.33) crosses 1.0, meaning the data are compatible with anything from a slightly lower to a substantially higher chair-associated rate.

The wide CI partly reflects sparse data: only 320.5 chair-hours and 40 eligible linked falls were available. Larger datasets would narrow the interval.

These results do not show that chair seating causes falls. This is an observational study. Key unmeasured confounders include patient acuity (higher-acuity patients may be selectively placed in chairs), nursing staffing ratios, and patient mobility classification. The correct interpretation: "Chair-seated patients showed higher exposure-normalized fall rates in this dataset. This is a signal that motivates prospective study, not a proven causal relationship."

Estimable analyses: HC3-robust, clustered-SE, and chair-origin reclassified analyses all preserved RR=2.35. Eligibility-threshold sensitivities at 12/24/48 hours stayed near 2.4.

Insufficient-data: Position-certainty-hours-only, time-window, weekday, and weekend subgroups were all marked non-estimable, a pre-specified safeguard against over-interpreting sparse subgroups.

Misclassification: Estimable swap scenarios ranged 4.49–13.17. Effect magnitude remains highly sensitive to label accuracy assumptions.

Some "room" falls actually originated from a chair — the patient left the chair then fell in the room. Tracking this provenance via the last_furniture field expands chair-associated events from 7 to 10 in the observation cohort (n=32). Post-departure latency: median 18 s for bed-origin (IQR 6–69, n=10), 50 s for chair-origin events (n=3).

The reclassification sensitivity test (RR 2.35, same as primary) shows this expanded definition did not materially change the modeled allocation in this run, likely because the reclassified events fall outside the primary eligible analysis base.

⚠️

SECTION 03

Key Limitations

Limitation	Severity	Implication
Position classifier accuracy (macro F1=0.528 · detection F1=0.846)	Material	Some falls may be position-misclassified. Effect size sensitive to label-swap assumptions. Classifier improvement is prerequisite for stronger claims.
Single health system	Material	Generalizability unknown. Local chair hardware, camera placement, and nursing workflows may differ substantially across systems.
Unmeasured confounders (acuity, staffing, mobility class)	Material	Residual confounding likely. Selection of high-acuity patients into chair placement could inflate the observed RR.
No injury-severity data	Moderate	Falls without injury included alongside injurious falls. Severity-weighted analyses not possible.
Mechanism cohort non-representative (n=32 convenience-sample events)	Moderate	86% footrest/positioning pattern is hypothesis-generating only. Not a population prevalence estimate.
Small chair-fall event count (40 modeled events · 320.5 chair-hours)	Lower	Wide CI (0.87–6.33). All subgroup analyses underpowered.

✅

SECTION 04

Action Checklist for Clinical Teams

✓
Share with nursing leadership as a QA signal — frame as hypothesis-generating. Include CI and limitation disclosures alongside any summary figures.
✓
Review current chair-setup protocols — does footrest confirmation and call-light accessibility appear in existing positioning workflows before leaving chair-seated patients unattended?
!
Do not implement major protocol changes based solely on this study: observational data from a single health system, with unmeasured confounders, cannot support causal policy decisions.
✓
Design a prospective pilot — partner with nursing quality and informatics to prospectively track chair-setup confirmation adherence against chair fall events in monitored units.
✓
Request classifier improvement roadmap — macro F1=0.528 is a prerequisite blocker for confident multi-site claims. Escalate to the LookDeep platform team.
✓
Obtain stakeholder sign-off on label quality limitations — Gate 2 passed with report-only thresholds. Explicit acknowledgment required before public or policy-level sharing.
!
Do not share externally without full limitation disclosure — DRAFT. Synced to submitted arXiv bundle. STROBE checklist in docs/strobe_checklist.md.

📖

SECTION 05

Key Terms Glossary

Rate Ratio (RR): The ratio of fall rates between two groups (chair ÷ bed). RR=1 means equal rates. RR=2.35 means chair-seated patients fell at an estimated 2.35× the per-hour rate of bed-bound patients.

95% CI: The range of RR values reasonably compatible with the observed data. Here (0.87–6.33) is wide because the event count is small. If the study were repeated many times, about 95% of the intervals constructed this way would contain the true RR.

p-value: p=0.0907 means that, if chair and bed fall rates were truly equal (RR=1), a result at least this extreme would occur about 9% of the time. The conventional significance threshold (p<0.05) is not met, which is why the result should be framed as uncertain.
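The RR, CI, and p-value are linked on the log scale, and a reader can check them against each other. The sketch below assumes the interval is a symmetric Wald interval on log(RR) with a normal reference distribution, which this report does not state explicitly; the function name is illustrative.

```python
import math


def wald_summary(rr, ci_low, ci_high, z_crit=1.959964):
    """Recover SE(log RR), z, and a two-sided p-value from an RR and its 95% CI.

    Assumes a Wald interval that is symmetric on the log scale.
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(rr) / se
    # Two-sided standard-normal tail probability: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


se, z, p = wald_summary(2.35, 0.87, 6.33)
print(f"SE(log RR)={se:.3f}, z={z:.2f}, p={p:.3f}")
```

Under these assumptions the recovered p-value lands close to the reported 0.0907; the small gap is what rounding the RR and CI to two decimals would produce.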

Poisson GLM: A regression model for count outcomes. Using log(exposure) as an offset converts this to a rate model, so the coefficient on position directly estimates the log rate ratio.

HC3 Standard Errors: A correction for heteroskedasticity (non-constant variance). It changes the standard errors, and therefore the CI and p-value, but not the point estimate.

Cluster-Robust SEs: Standard errors that account for within-division correlation. Exploratory here because only 9 intervention divisions were available, and cluster-robust SEs are unreliable with so few clusters.

macro F1 (0.528): Average F1 across all position classes, weighted equally. 1.0 is perfect; 0.528 indicates material limitations, especially away from the bed class.

Detection F1 (0.846): Event-level F1 for detecting that a fall occurred. Precision=1.000 (no false alarms in this benchmark); Recall=0.733 (some real falls still missed).
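The detection F1 follows directly from the precision and recall quoted above. A minimal sketch of both definitions, with illustrative function names; the per-class position F1 values behind the macro figure are not given in this report, so macro_f1 is shown without inputs.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_f1(per_class_f1):
    """Unweighted mean of per-class F1 scores (every class counts equally)."""
    if not per_class_f1:
        raise ValueError("need at least one class")
    return sum(per_class_f1) / len(per_class_f1)


# Detection benchmark values reported in this glossary.
detection_f1 = f1_score(precision=1.000, recall=0.733)
print(f"{detection_f1:.3f}")
```

Because macro F1 weights every class equally, a weak class pulls the average down even when the most common class is classified well.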

Gates 1/2/3: Gate 1 = data extraction validation. Gate 2 = label quality evaluation. Gate 3 = de-identification clearance. All three passed as prerequisites to inferential reporting.