DSR Number 2
another doctoral student review, another blog
Post created: 2021-05-07
Migrated my blog to Zola, mainly because I felt that there was too much magic going on in Hugo and because I cannot focus at all right now.
Tomorrow (or well, today now) is the deadline for another doctoral student review, so what have I been up to?
Engineering, per GitHub activity:
- Dec
- Annoying CI issues: brew, ant, openjdk, and we eventually just dropped Mac support.
- SHOW transaction_isolation
- I think the SQLancer testing team needed this?
- SanctionedSharedPointer
- fancy name for essentially a typedef...
- More annoying CI issues, we add lsof calls hoping to catch something. Note from the future, nothing was caught.
- Fix iterator invalidation in binding insert statements. Sneaky binder bugs.
- Wrote some documentation for network folder.
- Replication seemed to work.
- Spoiler from the future: it didn't work.
- Jan
- Fix missing includes.
- Continue fighting CI, this time:
- AdaptiveCheckTest is flaky, increase the sleep.
- Tried to fix messenger_test. Note from the future, didn't work.
- Add ptrace to all CI docker containers so that we can ssh in and gdb if we're fast enough.
- Tried to fix messenger_test again.
- Replication.
- Matt pointed out SingleStore had usefully defined policies. Made slight progress. Spent most of the last two weeks of Jan on the following decimals PR.
- Was trying to review and help fixup the fixed Decimals PR. Did not merge in the end.
- I probably should have pushed back harder on review.
- But it had also been sitting there so long without anyone reviewing it.
- Oh well.. this one was a bit of a bummer.
- On a related note, ported intCast for FixedDecimals. Got no reviewer, didn't merge.
- Punted question on whose duty it is to provide the requisite information -- optimizer or execution engine, who figures it out?
- Feb
- Refactor scripts/testing. Was hairy. Is still hairy.
- By using relative imports, at least CLion can tell you all usages.
- More CI issues.
- Import issues, path issues, etc.
- Delete domain socket, simplify junit trace reporting, reorder Jenkinsfile...
- Try to link less to save memory, try specifying build machines...
- Surprise unity build failure, fix that too...
- Update packages.sh to save some time on pip installs...
- While all this was happening, working on replication.
- Takeaway: It doesn't work on their local machine until it works in CI and/or a dev machine. Demos can be extremely misleading.
- First implementation fuckup: believing that sync replication simply means "message received". (Interestingly, SingleStore does choose this definition).
- Second implementation fuckup: conflating message received with message applied.
- Pretty constant progress from Feb 10ish to end of Feb, went through a couple of review cycles.
- Mar
- Quick documentation for debug symbols on clang builds.
- Merged replication from Feb.
- Quick fix to memory leak.
- Turns out CI wasn't checking DBMS exit code, so check that.
- Random enum autostringify PR to the side, cleaned up and merged.
- Add more documentation about debugging, this time CLion's attach debugger.
- Around mid March, realize replication is still broken as hell.
- Gave up on porting and branched off.
- Spun out tcop latch into a separate PR.
- Rewrote replication.
- Rewrote the Messenger to have guaranteed delivery semantics. Now all clients must implement the retry.
- The definition of active transaction shifted from when they first came up with replication, to when they wrote it. Was always unsafe (can stall) in sync case. Fixed with an extra NotifyOAT.
- The performance is hilariously bad (2x slower async, 100x slower sync) but we profiled that it was related to the serialization.
- Hopefully it really works now...
- Apr
- Merged replication.
- Small print message changes: QoL fix with startup message, remove print statement accidentally added.
- Added a command line flag for replication.
- Port LLVMEngine::Optimize to skip more passes.
- Interesting note from the future: transform passes are pretty useless right now?
- Refactor Jenkinsfile scripts, this time for 884.
- Then another PR to fix the inevitable breakages.
- Refactor stuff related to getting the pilot running end-to-end safely in CI.
- Help poke at EXPLAIN.
- We should have a wireshark tutorial.
- May
- 745 project concluded that it wasn't worth merging.
- But I have a decent idea of what to instrument where to collect optimize times now, if EXPLAIN ANALYZE team becomes a thing.
- Pending PR to just run check-clang-tidy on every file in the main build.
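To make the Feb "message received" vs "message applied" distinction concrete, here is a toy sketch. It is not the actual replication code: Replica, on_message, apply_pending and can_ack are all names made up for illustration, and the real system ships log records over the Messenger rather than appending to Python lists.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical toy replica: buffers incoming records, replays them later."""
    received: list = field(default_factory=list)  # records that arrived over the network
    applied: list = field(default_factory=list)   # records actually replayed into storage

    def on_message(self, record):
        # "Message received": the record is only buffered at this point.
        self.received.append(record)

    def apply_pending(self):
        # "Message applied": replay everything buffered but not yet applied.
        # applied is always a prefix of received, so slicing by length is enough.
        self.applied.extend(self.received[len(self.applied):])


def can_ack(replica, record, policy):
    """May the primary tell the client that a sync commit is done?

    policy="received": ack once the replica has buffered the record.
    policy="applied":  ack only once the replica has replayed the record.
    """
    if policy == "received":
        return record in replica.received
    if policy == "applied":
        return record in replica.applied
    raise ValueError(f"unknown ack policy: {policy}")


replica = Replica()
replica.on_message("txn1")
print(can_ack(replica, "txn1", "received"))  # buffered, so ackable under "received"
print(can_ack(replica, "txn1", "applied"))   # not replayed yet, so not ackable under "applied"
replica.apply_pending()
print(can_ack(replica, "txn1", "applied"))   # replayed now
```

The window between on_message and apply_pending is exactly where the two definitions disagree: under "received" the client can be told the commit is replicated while a read on the replica still cannot see it.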
High-level thoughts:
- I spent a lot of time on CI. I think I would have spent less time on CI if we were running with glued together barebones shell scripts.
- Replication is rather tricky.
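On the CI side, the March fix of checking the DBMS exit code boils down to something like the sketch below. This is an assumption-heavy stand-in, not the real test scripts: run_and_check is a hypothetical helper, and a short-lived Python subprocess plays the role of the DBMS binary.

```python
import subprocess
import sys


def run_and_check(cmd, timeout_s=60):
    """Run a command and raise if it exits non-zero, so CI cannot silently pass."""
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
    if result.returncode != 0:
        raise RuntimeError(
            f"{cmd[0]} exited with code {result.returncode}: {result.stderr.strip()}"
        )
    return result.stdout


# A process that exits cleanly passes through.
print(run_and_check([sys.executable, "-c", "print('ok')"]).strip())

# A process that exits with a non-zero code now fails loudly instead of being ignored.
try:
    run_and_check([sys.executable, "-c", "import sys; sys.exit(3)"])
except RuntimeError as err:
    print("caught:", err)
```

Raising instead of returning the code means a caller has to go out of its way to ignore a crash, which is the opposite of the default that bit us.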
Research:
- Replication works now. Beyond that, didn't make much progress...
- Started reading Settles' active learning monograph.
- I am getting quite familiar with all parts of the system, except for storage and the optimizer. Poked around a lot in ML stuff / execution engine stuff this semester.
Coursework:
- 745: Homework can be annoying. Specifically, LICM is a pain to implement. Soloing the programming was probably not the move, but whatever. Dataflow is conceptually interesting and satisfying, but I'm not sure that, e.g., LLVM or JetBrains are known for their conservativeness. It would be nice if the course material was refreshed.
- 884: Chill class, reasonably interesting material, easy to bounce questions off the prof.
Other:
- I started reading Database Internals and DDIA. Pretty good books. Would use over the DSC book.
- I made (or tried to make), in chronological order: bread pudding, sanmai oroshi'd a mackerel, various sashimi bowls, tempura ebi udon, soto ayam, japanese eggplant, martabak manis.
- I got both shots of Moderna.