GATE B
LANDED
physical copy on intake spindle
0
items here now
—
aggregate size
What this gate means
An item is at LANDED once the physical bytes are on local storage and checksummed. The pull is done; nothing has been sorted, deduped, or moved yet. This is the gate where the data costs the most disk space — the raw archive is fully expanded AND the source may still be reachable. Don't delete the source until checksums verify clean.
Sub-routines · operational playbook
Bulk pull · rclone
rclone copy --transfers=4 --checksum --retries=10
Standard pull pattern for any remote (Google Drive Takeout, GCS Workspace bucket, S3, B2, etc.). The --checksum flag makes rclone compare files by hash and size instead of modification time and size, where the backend supports hashes; --retries=10 rides over transient network blips overnight. Use --bwlimit if you need to throttle during work hours.
rclone copy : /Volumes/// \
  --transfers=4 --checksum --retries=10 --progress \
  --log-file=pull-$(date +%Y%m%d-%H%M).log --log-level=INFO
when: Every cloud pull
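The gate-exit checklist keys on rclone's exit code, so it can help to capture that code at pull time. The following is a minimal sketch, not part of the original playbook: the function name pull_and_record and the status file pull-status.txt are hypothetical, and the source and destination are passed in as arguments.

```shell
# pull_and_record SRC DEST
# Runs the standard pull and appends rclone's exit code to pull-status.txt.
# pull_and_record and pull-status.txt are hypothetical names for illustration.
pull_and_record() {
  local src=$1 dest=$2 log rc
  log="pull-$(date +%Y%m%d-%H%M).log"
  # "&& rc=0 || rc=$?" keeps the exit code even when the shell runs with set -e
  rclone copy "$src" "$dest" \
    --transfers=4 --checksum --retries=10 \
    --log-file="$log" --log-level=INFO && rc=0 || rc=$?
  # One line per pull: timestamp, exit code, and the log file to inspect
  printf '%s exit=%s log=%s\n' "$(date +%Y-%m-%dT%H:%M:%S)" "$rc" "$log" >> pull-status.txt
  return "$rc"
}
```

A non-zero exit means the pull is not finished; rerunning the same command is safe because rclone copy skips files that already match.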
Verify integrity
md5sum / sha256sum, write checksums.txt BEFORE deleting source
Don't trust 'no errors' alone. Generate checksums on the local copy, sample-verify against the source where possible (Google Drive exposes MD5 via API), and persist checksums.txt in the intake directory. This is the receipt that proves the local copy is good before you let go of the source.
# exclude checksums.txt itself, since it is being written while find runs
cd /Volumes/// && find . -type f ! -name checksums.txt -exec md5 {} \; > checksums.txt
# spot-check against source
rclone hashsum md5 : | sort > remote-checksums.txt
when: Immediately after rclone completes, BEFORE source deletion
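The two manifests above are not directly comparable: macOS md5 prints lines as "MD5 (./path) = hash", while rclone hashsum prints "hash  path". A minimal sketch of the comparison step, assuming those two output formats; the function names normalize_bsd_md5 and compare_manifests are hypothetical:

```shell
# normalize_bsd_md5 FILE
# Rewrites macOS "MD5 (./path) = hash" lines as "hash  path", drops any entry
# for checksums.txt itself, and sorts the result bytewise.
normalize_bsd_md5() {
  sed -E \
    -e 's#^MD5 \((\./)?(.*)\) = ([0-9a-f]{32})$#\3  \2#' \
    -e '/^[0-9a-f]{32}  checksums\.txt$/d' "$1" |
    LC_ALL=C sort
}

# compare_manifests LOCAL_MANIFEST REMOTE_MANIFEST
# Returns 0 when both manifests list the same hashes for the same paths.
compare_manifests() {
  local tmp_local tmp_remote status
  tmp_local=$(mktemp) || return 2
  tmp_remote=$(mktemp) || { rm -f "$tmp_local"; return 2; }
  normalize_bsd_md5 "$1" > "$tmp_local"
  LC_ALL=C sort "$2" > "$tmp_remote"
  # diff prints mismatched or missing entries and exits non-zero on a difference
  diff "$tmp_local" "$tmp_remote" && status=0 || status=$?
  rm -f "$tmp_local" "$tmp_remote"
  return "$status"
}
```

Run it as compare_manifests checksums.txt remote-checksums.txt. This is a full comparison, so it only makes sense when the remote listing covers the same tree as the local copy; for a sample check, compare a subset of lines instead.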
Landing target decision
external G-Drive (cloud chunks) OR RAID intake partition (>5 TB items)
Small cloud chunks (<2 TB per source) land on the external G-Drive scratch spindle. Big items (existing RAID + backups dedup pass, old-iMac/HD images) land on a dedicated intake partition on the RAID — keeps the canonical home clean while triage happens in scratch space.
when: Decide BEFORE pulling — moving 1 TB after the fact is wasted hours
Bandwidth budget
schedule overnight pulls; 100 Mbps = ~1 TB / 24 hr
Cloud pulls run at the connection's sustained throughput, not peak. Treat each chunk as overnight work and queue serially so daytime bandwidth stays free. The Family iCloud (2 TB) and the Workspace exports (2 TB each) will each want at least a full overnight window; at 100 Mbps a 2 TB chunk is roughly 44 hours of transfer, so expect it to span several nights.
# at-job to kick off at 23:00
# note: --bwlimit is in bytes per second, so 80M means 80 MiB/s (about 670 Mbit/s)
echo 'rclone copy ... --bwlimit=80M' | at 23:00
when: When the household needs bandwidth during the day
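The budget arithmetic above can be sketched as two small helpers. This assumes decimal terabytes (10^12 bytes) and a link that sustains the stated rate; the function names transfer_hours and bwlimit_for_mbps are hypothetical.

```shell
# transfer_hours SIZE_TB LINK_MBPS
# Hours to move SIZE_TB terabytes (10^12 bytes) at a sustained LINK_MBPS megabits/s.
transfer_hours() {
  LC_ALL=C awk -v tb="$1" -v mbps="$2" \
    'BEGIN { printf "%.1f\n", (tb * 8e12) / (mbps * 1e6) / 3600 }'
}

# bwlimit_for_mbps LINK_MBPS
# Converts megabits/s into the MiB/s figure that rclone's --bwlimit takes with an M suffix.
bwlimit_for_mbps() {
  LC_ALL=C awk -v mbps="$1" \
    'BEGIN { printf "%.1fM\n", (mbps * 1e6 / 8) / 1048576 }'
}
```

transfer_hours 1 100 prints 22.2, which matches the "100 Mbps = ~1 TB / 24 hr" rule of thumb, and transfer_hours 2 100 prints 44.4. bwlimit_for_mbps 80 prints 9.5M, the value to pass to --bwlimit if the intent is to cap a pull at 80 Mbit/s.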
Items at this gate
No items currently at this gate.
Gate-exit checklist
Verify before moving items into CATALOGED:
- All bytes copied (rclone exit code 0)
- checksums.txt persisted in the intake directory
- Source verified against local hash (sample or full)
- Free space remaining on landing spindle ≥ 20%
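Two of the checks above (checksums.txt present, free space at or above 20%) can be tested mechanically from the intake directory. A minimal sketch, assuming POSIX df -P output; the function names free_pct_from_df and gate_exit_ok are hypothetical, and the rclone exit code and source-hash checks still need to be confirmed separately.

```shell
# free_pct_from_df
# Reads `df -P` output on stdin and prints the free-space percentage (100 - Capacity).
free_pct_from_df() {
  awk 'NR == 2 {
    for (i = 1; i <= NF; i++)
      if ($i ~ /^[0-9]+%$/) { sub(/%/, "", $i); print 100 - $i; exit }
  }'
}

# gate_exit_ok INTAKE_DIR [MIN_FREE_PCT]
# Succeeds only if checksums.txt exists and is non-empty, and free space meets the floor.
gate_exit_ok() {
  local dir=$1 min_free=${2:-20} free
  [ -s "$dir/checksums.txt" ] || { echo "missing or empty checksums.txt" >&2; return 1; }
  free=$(df -P "$dir" | free_pct_from_df)
  [ -n "$free" ] || { echo "could not read free space for $dir" >&2; return 1; }
  [ "$free" -ge "$min_free" ] || { echo "only ${free}% free on landing spindle" >&2; return 1; }
}
```

df rounds the Capacity column, so treat a result right at the 20% line as a fail and free up space before moving items on.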
Gate B · LANDED · baked 2026-05-29 from migrations.yml + GATE_DETAIL