James Gathogo

Digital Data Collection Design

XLSForm architecture, skip logic, validation, and multilingual deployment on KoBoToolbox/ODK for large-scale education and disability inclusion programmes.

KoBoToolbox · XLSForm / ODK · GEC-T Programme · FCDO-funded

XLSForm Structure

A realistic GEC-T inclusive education form: consent, demographics, Washington Group disability screening, cascading geography, repeat groups, and GPS capture.

survey

| type | name | label | relevant | constraint | constraint_message | required | appearance |
|---|---|---|---|---|---|---|---|
| select_one yes_no | consent | Do you give informed consent to participate? | | | | yes | |
| text | student_id | Student unique ID | ${consent} = 'yes' | regex(., '^GEC-[0-9]{4}$') | Format: GEC-0001 | yes | |
| select_one schools | school_name | Name of institution | ${consent} = 'yes' | | | yes | autocomplete |
| select_one counties | county | County | ${consent} = 'yes' | | | yes | minimal |
| select_one sub_counties | sub_county | Sub-county | ${consent} = 'yes' | | | yes | minimal |
| select_one wards | ward | Ward | ${consent} = 'yes' | | | yes | minimal |
| integer | age | Age of student (years) | ${consent} = 'yes' | . >= 5 and . <= 25 | Age must be between 5 and 25 | yes | |
| select_multiple disability_list | disability_type | Disability domain (WG-SS) | ${consent} = 'yes' | | | yes | |
| select_one aid_types | visual_aid_type | Type of visual assistive device | selected(${disability_type}, 'visual') | | | yes | |
| begin_repeat | household_members | Household member details | ${consent} = 'yes' | | | | |
| text | hh_name | Name of household member | | | | yes | |
| select_one relationship | hh_relation | Relationship to student | | | | yes | |
| end_repeat | | | | | | | |
| select_one grades | grade | Current grade/class | ${consent} = 'yes' | | | yes | minimal |
| select_one school_types | school_type | Type of institution | ${consent} = 'yes' | if(${grade} > 8, . = 'secondary', true()) | Grade 9+ requires secondary school | yes | |
| date | enrollment_date | Date of enrolment | ${consent} = 'yes' | . <= today() | Enrolment date cannot be in the future | yes | |
| text | caregiver_phone | Caregiver phone number | ${consent} = 'yes' | regex(., '^\+254[0-9]{9}$') | Use format +254XXXXXXXXX | no | |
| image | school_photo | Photo of school entrance | ${consent} = 'yes' | | | no | |
| geopoint | gps_location | GPS coordinates of institution | ${consent} = 'yes' | | | yes | placement-map |

choices

| list_name | name | label | county (filter) |
|---|---|---|---|
| yes_no | yes | Yes | |
| yes_no | no | No | |
| disability_list | visual | Seeing - even with glasses | |
| disability_list | hearing | Hearing - even with hearing aid | |
| disability_list | mobility | Walking or climbing steps | |
| disability_list | cognitive | Remembering or concentrating | |
| disability_list | selfcare | Self-care (washing, dressing) | |
| disability_list | communication | Communicating in usual language | |
| counties | nairobi | Nairobi | |
| counties | kisumu | Kisumu | |
| counties | mombasa | Mombasa | |
| sub_counties | westlands | Westlands | nairobi |
| sub_counties | kasarani | Kasarani | nairobi |
| sub_counties | kisumu_central | Kisumu Central | kisumu |
| sub_counties | changamwe | Changamwe | mombasa |
| schools | sch_001 | Moi Avenue Primary | |
| schools | sch_002 | Uhuru Gardens Special School | |
| aid_types | spectacles | Spectacles / glasses | |
| aid_types | magnifier | Magnifying device | |
| aid_types | braille | Braille materials | |

settings

| form_title | form_id | version | default_language | style |
|---|---|---|---|---|
| GEC-T Inclusive Education Endline 2024 | gect_ie_endline_v3 | 2024.03.1 | English (en) | pages |
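
A form of this size is easy to break by referencing a choice list that was never defined. The Python sketch below is one possible pre-deployment check, not part of the original form: it takes a few survey and choices rows copied from the sheets above as plain dictionaries and reports any select question whose list is missing. The function name `missing_choice_lists` and the in-memory row layout are assumptions made for illustration.

```python
# Illustrative lint: every select_one / select_multiple question must
# reference a list_name that exists on the choices sheet.
survey = [
    {"type": "select_one yes_no", "name": "consent"},
    {"type": "select_one counties", "name": "county"},
    {"type": "select_one wards", "name": "ward"},
    {"type": "select_multiple disability_list", "name": "disability_type"},
    {"type": "integer", "name": "age"},
]

choices = [
    {"list_name": "yes_no", "name": "yes"},
    {"list_name": "yes_no", "name": "no"},
    {"list_name": "counties", "name": "nairobi"},
    {"list_name": "disability_list", "name": "visual"},
]


def missing_choice_lists(survey_rows, choice_rows):
    """Return {question name: list_name} for selects whose list is undefined."""
    defined = {row["list_name"] for row in choice_rows}
    missing = {}
    for row in survey_rows:
        parts = row["type"].split()
        is_select = len(parts) >= 2 and parts[0] in ("select_one", "select_multiple")
        if is_select and parts[1] not in defined:
            missing[row["name"]] = parts[1]
    return missing


print(missing_choice_lists(survey, choices))
```

With these sample rows the check flags `ward`, because the choices excerpt shown here does not include a `wards` list; the full form would need one.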

Form Routing Visualisation

Branching logic ensures enumerators only see relevant questions, reducing survey fatigue and data errors.

Start Survey
Informed Consent?
  Yes:
    Demographics & Geography
    Washington Group Questions
      Disability detected: Disability-Specific Module
      No disability: General Education Module
    School Performance
    GPS + Photo Capture
    End & Submit
  No:
    End (No Data Collected)
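
The routing above can be expressed as a small function, which is handy for unit-testing skip logic before it is encoded in `relevant` expressions. This is a minimal sketch, assuming that both disability branches rejoin at School Performance (the diagram suggests this but does not state it); the function name `route` is hypothetical.

```python
def route(consent, disability_domains):
    """Return the ordered modules an enumerator would see.

    consent: "yes" or "no"
    disability_domains: list of selected WG-SS domains (may be empty)
    """
    if consent != "yes":
        return ["End (No Data Collected)"]

    modules = ["Demographics & Geography", "Washington Group Questions"]
    if disability_domains:
        modules.append("Disability-Specific Module")
    else:
        modules.append("General Education Module")
    # Assumption: both branches converge on the remaining modules.
    modules += ["School Performance", "GPS + Photo Capture", "End & Submit"]
    return modules


print(route("yes", ["visual"]))
print(route("no", []))
```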

Validation Rules

Built-in constraints that catch errors at the point of collection, not after data cleaning.

Age Range

Student age must fall within programme eligibility criteria.

. >= 5 and . <= 25

Date Logic

Enrolment date cannot be in the future.

. <= today()

Cross-Field Validation

If student is in grade 9+, school type must be secondary.

if(${grade} > 8, ${school_type} = 'secondary', true())
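
The same rules can be re-applied in the cleaning step as a safety net, for example when data arrives from an older form version. The functions below are a hedged Python mirror of the age, date, and cross-field rules described above; the function names are illustrative, and the grade is assumed to be stored as a number.

```python
from datetime import date


def check_age(age):
    """Age must fall within the 5 to 25 eligibility range."""
    return 5 <= age <= 25


def check_enrolment_date(enrolled, today=None):
    """Enrolment date must not be in the future."""
    today = today or date.today()
    return enrolled <= today


def check_school_type(grade, school_type):
    """Grade 9 and above requires a secondary school; lower grades always pass."""
    if grade > 8:
        return school_type == "secondary"
    return True


print(check_age(12), check_age(30))
print(check_school_type(9, "primary"), check_school_type(6, "primary"))
```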

Phone Number Regex

Kenyan mobile number in international format: +254 followed by nine digits, with no spaces (for example, +2547XXXXXXXX).

regex(., '^\+254[0-9]{9}$')
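
Regex constraints are worth testing against known good and bad values before deployment. The sketch below checks the phone pattern and the student ID pattern from the survey sheet using Python's `re.fullmatch`, which anchors both ends of the string and so stands in for the `^` and `$` of the XLSForm expressions. The helper names and sample values are made up for illustration.

```python
import re

PHONE_PATTERN = r"\+254[0-9]{9}"    # same body as the XLSForm phone regex
STUDENT_ID_PATTERN = r"GEC-[0-9]{4}"  # same body as the student_id regex


def valid_phone(value):
    return re.fullmatch(PHONE_PATTERN, value) is not None


def valid_student_id(value):
    return re.fullmatch(STUDENT_ID_PATTERN, value) is not None


for sample in ["+254712345678", "+254 712 345 678", "0712345678"]:
    print(sample, valid_phone(sample))

for sample in ["GEC-0001", "GEC-001", "gec-0001"]:
    print(sample, valid_student_id(sample))
```

Only the first value in each list passes; spaces, a local `07...` prefix, a short ID, and a lowercase prefix are all rejected.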

Conditional Required

Disability module only required when consent is granted.

required: ${consent} = 'yes'

GPS Accuracy Threshold

Reject GPS readings with accuracy worse than 10 metres.

selected-at(., 3) <= 10
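
A geopoint is stored as a space-separated string of latitude, longitude, altitude, and accuracy, so the accuracy is the fourth value. The Python sketch below applies the same 10-metre threshold to exported data; the function names are hypothetical, and treating a missing accuracy value as a failure is an assumption rather than a documented rule of the programme.

```python
def gps_accuracy(geopoint):
    """Return accuracy in metres from a 'lat lon alt acc' string, or None."""
    parts = geopoint.split()
    if len(parts) < 4:
        return None
    return float(parts[3])


def accuracy_ok(geopoint, max_metres=10.0):
    """True when the reading is no worse than max_metres."""
    accuracy = gps_accuracy(geopoint)
    # Assumption: a reading without an accuracy value is rejected.
    return accuracy is not None and accuracy <= max_metres


print(accuracy_ok("-1.2921 36.8219 1795.0 4.5"))
print(accuracy_ok("-1.2921 36.8219 1795.0 25.0"))
```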

From Collection to Dashboard

End-to-end flow from KoBoToolbox field data to actionable programme dashboards.

📱 Collect: KoBoToolbox / ODK Collect on Android tablets with offline capability
🔌 API / Export: KoBo REST API v2 or scheduled CSV/XLSX exports with token auth
🧹 Clean & Transform: Python / R scripts for deduplication, recoding, WG-SS scoring, and PII removal
📊 Dashboard: Power BI (donor/HQ) or Google Sheets (field teams) with live refresh
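
The API step can be sketched with the Python standard library alone. The example below builds a token-authenticated request for the KoBo API v2 data endpoint and defines a separate fetch function that is not called here. The server URL, asset UID, and token are placeholders, and reading submissions from a `results` key reflects the usual paginated response shape, which should be confirmed against the server in use.

```python
import json
import urllib.request


def build_data_request(server, asset_uid, token):
    """Build a GET request for an asset's submissions as JSON."""
    url = f"{server.rstrip('/')}/api/v2/assets/{asset_uid}/data/?format=json"
    return urllib.request.Request(url, headers={"Authorization": f"Token {token}"})


def fetch_submissions(request):
    """Send the request and return the list of submissions (network call)."""
    with urllib.request.urlopen(request, timeout=30) as response:
        payload = json.load(response)
    return payload["results"]


# Placeholder values for illustration only; nothing is sent at this point.
request = build_data_request("https://kf.kobotoolbox.org", "ASSET_UID", "API_TOKEN")
print(request.full_url)
```

Keeping the request builder separate from the network call makes the URL and header logic testable offline.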
  • Duplicate submission detection via student_id
  • GPS outlier flagging (> 50 km from expected county centroid)
  • Completeness threshold: reject forms below 80% field coverage
  • Weekly data quality report (Python-generated PDF)
  • Real-time submission tracker (Google Sheets)
  • Quarterly indicator dashboard (Power BI, 6 pages)
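
The three quality checks listed above (duplicates, GPS outliers, completeness) can be sketched in plain Python as follows. The centroid coordinates are approximate values chosen for illustration, and the field list, function names, and sample rows are assumptions; a production script would load real centroids and the full field list from the form definition.

```python
import math
from collections import Counter

# Approximate centroids (lat, lon), for illustration only.
COUNTY_CENTROIDS = {
    "nairobi": (-1.286, 36.817),
    "kisumu": (-0.092, 34.768),
    "mombasa": (-4.043, 39.668),
}
REQUIRED_FIELDS = ["student_id", "county", "age", "grade", "gps_location"]


def duplicate_ids(rows):
    """Return student_id values that appear more than once."""
    counts = Counter(r["student_id"] for r in rows if r.get("student_id"))
    return sorted(sid for sid, n in counts.items() if n > 1)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat = p2 - p1
    dlon = math.radians(lon2 - lon1)
    a = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def is_gps_outlier(row, max_km=50.0):
    """True when the point is more than max_km from its county centroid."""
    lat, lon = (float(v) for v in row["gps_location"].split()[:2])
    c_lat, c_lon = COUNTY_CENTROIDS[row["county"]]
    return haversine_km(lat, lon, c_lat, c_lon) > max_km


def completeness(row, fields=REQUIRED_FIELDS):
    """Share of expected fields that are filled in (0.0 to 1.0)."""
    filled = sum(1 for f in fields if row.get(f) not in (None, ""))
    return filled / len(fields)


rows = [
    {"student_id": "GEC-0001", "county": "nairobi", "age": 12, "grade": "6",
     "gps_location": "-1.2900 36.8200 1700 5"},
    {"student_id": "GEC-0001", "county": "nairobi", "age": 12, "grade": "6",
     "gps_location": "-1.2900 36.8200 1700 5"},
    {"student_id": "GEC-0002", "county": "kisumu", "age": 14, "grade": "",
     "gps_location": "-4.0500 39.6600 20 6"},
]

print(duplicate_ids(rows))
print([is_gps_outlier(r) for r in rows])
print([completeness(r) >= 0.8 for r in rows])
```

In the sample data the third row is recorded under Kisumu but carries coordinates near Mombasa, so it is flagged as an outlier; its blank grade leaves it at exactly 80% coverage, which still passes the threshold.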

Multi-Language Form Deployment

Forms deployed in multiple languages allow enumerators to switch between English, Kiswahili, and Amharic during interviews.

| name | label::English (en) | label::Kiswahili (sw) | label::Amharic (am) |
|---|---|---|---|
| consent | Do you give informed consent to participate? | Je, unatoa idhini yako kushiriki? | ለመሳተፍ ፈቃደኛ ነዎት? |
| student_id | Student unique ID | Nambari ya kipekee ya mwanafunzi | የተማሪ ልዩ መለያ ቁጥር |
| age | Age of student (years) | Umri wa mwanafunzi (miaka) | የተማሪ ዕድሜ (ዓመት) |
| disability_type | Disability domain (WG-SS) | Aina ya ulemavu (WG-SS) | የአካል ጉዳት ዓይነት (WG-SS) |
| school_name | Name of institution | Jina la shule | የትምህርት ቤት ስም |
| gps_location | GPS coordinates of institution | Kuratibu za GPS za shule | የትምህርት ቤት GPS መጋጠሚያ |
| grade | Current grade/class | Darasa la sasa | የአሁኑ ክፍል |
| caregiver_phone | Caregiver phone number | Nambari ya simu ya mlezi | የአሳዳጊ ስልክ ቁጥር |
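
A blank translation cell usually surfaces as an empty question label when the enumerator switches language. The Python sketch below is one way to catch such gaps before deployment; the `consent` row is taken from the table above, while the `ward` row with a blank Amharic label is a made-up example and the function name is hypothetical.

```python
LANG_COLUMNS = [
    "label::English (en)",
    "label::Kiswahili (sw)",
    "label::Amharic (am)",
]


def missing_translations(rows, columns=LANG_COLUMNS):
    """Return (question name, column) pairs where a label is blank or absent."""
    gaps = []
    for row in rows:
        for column in columns:
            if not (row.get(column) or "").strip():
                gaps.append((row["name"], column))
    return gaps


rows = [
    {
        "name": "consent",
        "label::English (en)": "Do you give informed consent to participate?",
        "label::Kiswahili (sw)": "Je, unatoa idhini yako kushiriki?",
        "label::Amharic (am)": "ለመሳተፍ ፈቃደኛ ነዎት?",
    },
    {
        # Hypothetical incomplete row, for illustration only.
        "name": "ward",
        "label::English (en)": "Ward",
        "label::Kiswahili (sw)": "Kata",
        "label::Amharic (am)": "",
    },
]

print(missing_translations(rows))
```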

Technical Specifications

GEC-T Inclusive Education Programme, Leonard Cheshire, Kenya. FCDO-funded external evaluation across three data collection rounds.

83
Institutions surveyed
2,100+
Girls with disabilities tracked
3
Rounds (BL / ML / EL)
4
Languages supported
WG-SS
Washington Group integrated
NACOSTI
Ethics approval obtained

Cascading selects: County → Sub-county → Ward geography uses choice_filter to dynamically filter options based on prior selection, reducing enumerator error in field conditions.
Washington Group Short Set: Six functional difficulty domains aligned to the WG-SS standard, enabling international comparability of disability prevalence data across GEC programmes.
Repeat groups: Household member module uses repeat_count to capture variable-length household rosters without duplicating form sections.
Offline-first: All forms are deployed via ODK Collect / KoBo Collect with full offline capability; submissions queue and sync when connectivity is restored.
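
To illustrate the cascading select described above: in XLSForm, the sub_county question would typically carry a choice_filter such as `county=${county}`, which keeps only the choices whose `county` column matches the earlier answer. The Python sketch below imitates that filtering on the sub-county choices from this form; the function name `filter_choices` is an assumption, and the real filtering is done by the form engine, not by a script.

```python
SUB_COUNTIES = [
    {"name": "westlands", "label": "Westlands", "county": "nairobi"},
    {"name": "kasarani", "label": "Kasarani", "county": "nairobi"},
    {"name": "kisumu_central", "label": "Kisumu Central", "county": "kisumu"},
    {"name": "changamwe", "label": "Changamwe", "county": "mombasa"},
]


def filter_choices(choices, **selected):
    """Keep choices whose filter columns match every prior selection."""
    return [
        choice
        for choice in choices
        if all(choice.get(column) == value for column, value in selected.items())
    ]


nairobi_options = filter_choices(SUB_COUNTIES, county="nairobi")
print([choice["label"] for choice in nairobi_options])
```

Selecting Nairobi leaves only Westlands and Kasarani; the same pattern extends to wards by adding a `sub_county` column to the wards choice list.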