My journey building a cloud based, open source, serverless, customer deployed, scalable spreadsheet.
See InfiniSheet for my attempt at implementing some of these ideas.
-
React Spreadsheet
Spreadsheets
Spreadsheet Data Model
Last time, we defined a minimal data interface and hooked it up to VirtualSpreadsheet. Now I need to get a better understanding of the standard spreadsheet data model so that I can flesh out the interface to match.
-
React Spreadsheet
Spreadsheets
Spreadsheet Column Naming
Last time, I ported some spreadsheet-ish sample code into my new stub react-spreadsheet package, and called it a spreadsheet. Unsurprisingly, it’s not very good. Time to start iterating.
-
React Virtual Scroll
Spreadsheets
React Virtual Scroll 0.4.0 : Customization
My VirtualList and VirtualGrid components use the same approach as React-Window. A lean and mean implementation that focuses just on virtualization. This is not SlickGrid. The idea is that you can use customization to build whatever higher level functionality you need on top.
-
React Virtual Scroll
Spreadsheets
React Virtual Scroll Grid 4 : Big Grid
After a long trip down the rabbit hole, I have two working implementations of a React based virtual scrolling list. No flicker, no going blank while scrolling.
-
Front End
Spreadsheets
React Virtual Scroll
Paged Infinite Virtual Scrolling
I’m working on a cloud spreadsheet system. It will support spreadsheets with millions of rows and columns. Potentially far more data than will fit into client memory, particularly a web client. Which means I need a front end implementation that can handle that.
-
Cloud Architecture
Spreadsheets
Consistency for Event Sourced Systems
I’m a big fan of Event Sourced systems. I have a whole series of posts on implementing a cloud spreadsheet using event sourcing. However, so far, I’ve mostly waved my hands and told you that everything is wonderful.
-
Spreadsheets
AWS
Eventual Consistency for an Event Sourced Spreadsheet
Last time we looked at general approaches to ensuring eventual consistency in the cloud. Now it’s time to apply what we’ve learnt to the case of my Event Sourced Cloud Spreadsheet. Previously, I went into some detail on how to implement an Event Log using DynamoDB. Long story short, there are some operations that involve multiple writes and some that need to trigger side effects.
-
Spreadsheets
Databases
AWS
Implementing a Spreadsheet Event Log on DynamoDB
In the distant past, before I got sucked into a seemingly never ending series on databases, I said that I was going to start formalizing the format for my cloud based, serverless, event sourced spreadsheet. I realize now that I’ve said very little on how I’m going to implement the central component of my spreadsheet, the event log.
-
Spreadsheets
Merging and Importing Spreadsheet Snapshots
Last time, we looked at the added complexity that comes when you start inserting and deleting rows and columns from your spreadsheet. Spreadsheet snapshots are made up of multiple segments. Once you start inserting and deleting things, those segments are in different coordinate spaces. You need to transform the earlier segments into the coordinate space of the most recent as you load them.
-
Spreadsheets
Cloud Architecture
Making Spreadsheet Snapshots work with Insert and Delete
When you’re implementing a cloud spreadsheet, it’s tempting to think of it as just another kind of database. Each row of the spreadsheet is equivalent to a row in a database. Each column in the spreadsheet is equivalent to a column in a database. Yes, spreadsheets don’t have schemas. Yes, spreadsheets can have lots of columns. However, there are plenty of examples of NoSQL databases that are schemaless and have wide column stores.
-
Spreadsheets
Cloud Architecture
AWS
Data Structures for Spreadsheet Snapshots
I have a plan. After a round of brainstorming and benchmarking, I’ve decided to use Event Sourcing to store the sequence of operations applied to a spreadsheet. Every so often I’ll create snapshots of the current spreadsheet state. I can then load the spreadsheet at any point in time by loading a snapshot and applying changes from that point on in the event log.
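A minimal sketch of that load path, assuming a hypothetical log of "set cell" events with increasing sequence numbers and snapshots that record the last sequence number they include. The names `SetCellEvent`, `Snapshot`, `cellKey` and `loadAt` are made up for illustration and are not taken from the actual implementation.

```typescript
// Hypothetical types for illustration only.
type CellValue = string | number;

interface SetCellEvent {
  sequence: number; // position of this event in the log, starting at 0
  row: number;
  column: number;
  value: CellValue;
}

interface Snapshot {
  sequence: number; // last event already folded into this snapshot
  cells: Map<string, CellValue>;
}

const cellKey = (row: number, column: number): string => `${row},${column}`;

// Rebuild the sheet as it was just after the event numbered `target`.
// Assumes `log` is ordered by sequence number.
function loadAt(
  snapshots: Snapshot[],
  log: SetCellEvent[],
  target: number
): Map<string, CellValue> {
  // Pick the most recent snapshot that is not newer than the target.
  let base: Snapshot | undefined;
  for (const snapshot of snapshots) {
    if (snapshot.sequence <= target &&
        (base === undefined || snapshot.sequence > base.sequence)) {
      base = snapshot;
    }
  }

  // Copy the snapshot so that replaying events does not mutate it.
  const cells = new Map<string, CellValue>(base?.cells);
  const from = base ? base.sequence : -1;

  // Replay only the events after the snapshot, up to and including the target.
  for (const event of log) {
    if (event.sequence > from && event.sequence <= target) {
      cells.set(cellKey(event.row, event.column), event.value);
    }
  }
  return cells;
}
```

If no snapshot is old enough, the sketch falls back to replaying the log from the start, which is the same thing as having no snapshots at all.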
-
Spreadsheets
Cloud Architecture
AWS
Brainstorming and Benchmarking
Last time I took you on a tour of the world’s most boring spreadsheet. I used a basic, if large, spreadsheet to identify some benchmarks that I can use to assess the viability of the crazy implementation ideas we’re going to brainstorm. The benchmarks are by no means exhaustive - think of them as the very low bar that any idea needs to get over to be worth considering further.
-
Spreadsheets
The World's Most Boring Spreadsheet
What’s the best way to get started with a big new project? Something like building an open source, serverless, customer deployed, scalable spreadsheet from scratch?
-
Spreadsheets
Cloud Architecture
Spreadsheets are the Future of Data Systems
The final chapter of Martin Kleppmann’s wonderful book Designing Data-Intensive Applications is called “The Future of Data Systems”. In this chapter he talks about data integration between different specialized systems using flows of derivative data, unbundling today’s complex databases into simpler specialized data storage components and composing them with dataflow processing systems. At one point, almost as a throwaway remark, he mentions that spreadsheets already have most of the dataflow programming capabilities that such a system would need. Of course, a spreadsheet is just a spreadsheet. A real data system needs to be durable, scalable and fault tolerant. It needs to integrate with a wide variety of disparate technologies.
-
Navisworks
Autodesk
Spreadsheets
Tools vs Solutions
James Awe is the first Software Architect I met at Autodesk. He was involved in the acquisition process for Navisworks, where I was then CTO. Some years later he shared the story of how he first became aware of Navisworks.