380 Web Scalability for Startup Engineers
AMI (Amazon Machine Image), 115, 339 overview of, 246
AMQP (Advanced Message Queuing Protocol), shopping analogy of, 254–256
summary, 301–302
265, 288–289 synchronous example, 247–250
anti-patterns, message queue, 282–284 synchronous processing vs., 246
Apache mod_proxy, 224 Atlassian Bamboo, 337
Apache server, 53–54 at-least-once delivery, message requeueing, 280
Apache Traffic Server, 224 atomic counters, application-level sharding, 184
API-first design, web services, 127–129, 130–131 atomic transactions, ACID, 177
APIs (application programming interfaces) authentication, 138, 141
authorization, stateless web services for, 141–142
learning simplicity from, 42–43 automation
promoting simplicity with TDD, 41 build and deployment, 49, 335–340
in web application development, 124 Cassandra, 202–203
web service design not using, 124–127 log aggregation, 345–347
Apollo, 292 monitoring and alerting, 340–345
application architecture overview of, 332–333
front end, 28–30 scaling with load balancers, 105
overview of, 27–28 self-healing repair process, 79–80
supporting technologies, 34–35 testing, 333–335
web services, 30–34 auto-scaling
application metrics, reducing MTTR, 342 front-end scalability with, 114–116
application servers with load balancers, 105–107
caching objects directly on, 231, 232 stateless web service machines with, 140
deploying web services in parallel to, 25 availability
distributed storage/delivery of public files, 94 CAP theorem and, 191–192
object cache servers using front-end, 25 message queues evening out traffic spikes
placing reverse proxy servers, 220
serving private files, 95 for, 273–274
stateful web, 86 MySQL ring replication reducing,
using locks, 98–99
in web application layer, 24 164–165
applications reduced throughput vs., 274–275
caching objects. See object caches Azure
promoting loose coupling, 44–46 auto-scaling with, 115
architecture Blob Storage for files, 93–95
application. See application architecture Diagnostics, 346
Cassandra, 199–204 Load Balancer, 111, 140
event-driven. See EDA (event-driven load balancer as hosted service in,
architecture) 106–107
flexibility of good, 316 scalability limits in Queues, 269
function-centric web services, 131–134 Search, 329
resource-centric web services, 134–138 Service Bus Queues, 269
Astyanax Chunked Object Store, Netflix, 96 SQL Database Elastic Scale, 184
asynchronous nature, of MySQL replication,
158–159 B
asynchronous processing
comparing messaging platforms, 284–294 back-end server termination, ELB, 107
currently in renaissance phase, 301–302 backpressure feature, RabbitMQ, 293
direct worker queue interactions as, 296–297 backup, MySQL replication, 159–160
event-driven architecture as, 295–301 bandwidth, CDN reducing, 15
example of, 249–254 benchmarks, gaining value of, 113
message queue anti-patterns, 282–284 best practices, continuous deployment pipeline,
message queue benefits, 270–276
message queue challenges, 276–282 339–340
message queues in, 256–270 big data, 25
Big O notation, 305, 307–308, 328
Index 381
Big-IP, F5, 109–111 caching proxy, 219–220
BigTable, 190–191, 317 caching servers
binary search algorithm, 306–307
bindings, RabbitMQ, 264 caches co-located with code, 230–232
binlog file, MySQL replication, 157–158, 161 distributed, 232–233
Blob Storage, Azure, 93–95 scaling object caches, 238
blocking, 248–249 call stack
blocking I/O, 248 asynchronous processing as programming
book index structure, 306, 328
book references, for this book, 364–366 without, 281
browser cache caching high up, 239–240
latency dictated by weakest link in, 296
HTTP-based, 208, 218–219 callback
scalability of, 223 asynchronous call with, 251–254
scaling front end with, 113–114 definition of, 249
buffers, active data set size, 168 shopping analogy with, 255–256
build process, automating, 49, 335–340 Camel, integration with ActiveMQ, 264, 291–292
burnout, preventing, 348–349 CAP (consistency, availability, and partition
business continuity plans, 79 tolerance) theorem, 191–192
business logic capacity, increasing
front-end servers freed of, 22–24, 29 adding more servers, 73, 77
hexagonal architecture and, 32 front line components, 22–24
pushing to web services, 24–25, 30–34 load balancer limits, 109
web application layer freed of, 24 load balancers increasing, 105
business metrics, reducing MTTR, 342 scalability and, 3
scaling horizontally, 16–17
C scaling vertically, 8–10
cardinality, estimating for index, 308–310
cache hit ratio Cassandra
bundling CSS/JS files to maximize, 216 failure handling, 80
overview of, 208–209 scaling own file storage/delivery, 96
scaling reverse proxies, 224, 227 self-healing strategies, 196
topology, 199–204
cache invalidation, 233, 243–244 as wide columnar data store, 317, 319–325
cache key space, 208, 224 CDNs (content delivery networks)
cache-aside caches. See object caches caching front end, 113
Cache-Control HTTP header, 213–217 definition of, 14
cached objects. See HTTP-based caches delivering public files to end users, 94
caches co-located with code, 230–232 as front-end layer component, 101
caching horizontal scaling with, 17
hosting own hardware with, 121
application objects. See object caches as HTTP-based cache, 221–222
AWS web deployment example, offloading traffic to third party with, 13
reducing bandwidth with, 15
117–119 scalability of, 223
cache hit ratio. See cache hit ratio scaling for global audience, 20–21
data partitioning using, 76–77 central processing unit. See CPU (central
definition of, 12 processing unit)
front-end components using back end, 29 centralized log service, streaming logs to, 345
HTTP-based. See HTTP-based caches Chaos Monkey, 78
local server, 97–98 Chrome, 53–54
Nginx benefits, 108 circular dependencies, 47
overview of, 208 class diagrams, 59–60
preventing MySQL replication timing classes
avoiding unnecessary coupling, 47
issues, 169 dependencies of, 60
rules of thumb for, 239–244
scaling front end using, 113–114
summary, 244
classes (cont.) coarse locks, 143
dependency injection principle, 65–68 code
designing, 53
promoting loose coupling, 44–46 80/20 rule for, 352–353
single responsibility principle of, 61–63 problems from hacking own, 50
reducing coupling when writing, 45
clients reviews, 355
decoupling providers and, 51–54 writing vs. reusing, 49
interacting with HTTP read-through coding to contract, 51–54, 60
caches, 211–212 collocated servers, 13
in request/response interactions, 298–299 column families, Cassandra, 321–322
stateless web service machines and, 139 Common Object Request Broker Architecture
(CORBA), 132
client-side caches communication paths, 357–358
caching high up call stack, 240 complexity
overview of, 228–229 application-level sharding and, 184
scaling, 233–234 dependency injection reducing, 68
key-value stores offering least, 317
client-side conflict resolution, 195–196 message queue challenges, 281
clones promoting simplicity by hiding, 38–40
reducing coupling by hiding, 44
implementing distributed locking, 99–100 reducing with single responsibility, 61–63
load balancing by adding, 104–105, 108, 110 shared lock management service
multiple slave machines and, 158
in publish/subscribe method, 263 increasing, 99
replication challenges, 166 of SOAP, 134
reverse proxy as independent, 226 composition, and repetitive code, 50
scaling by adding, 71, 72–74 compound (composite) indexes
scaling REST web services by adding, 138, 140 Cassandra tables similar to, 319–320
scaling web cluster by adding, 232 definition of, 311
using local/independent locks, 98–99 ordering of columns in, 311–312
Cloud Load Balancers, Rackspace, 111 structuring data as, 325
cloud-based hosting wide columnar data stores using, 317
auto-scaling with load balancers, 105 concurrency, measuring for higher, 3–4
AWS web deployment, 117–119 conflict resolution
Azure SQL Database Elastic Scale, 184 client-side, 195–196
file storage, 93–95 of data store with CAP theorem, 191–192
in, 268 eventual consistency and, 194–195
load balancer, 106–107 self-healing strategies for, 196
log aggregation, 346 connection: close response header, 212
MySQL, 170 connection draining, Elastic Load Balancer, 107
stateless web service machines, 140 consistency
CloudFront, Amazon ACID transactions, 177
AWS web deployment, 117–119 CAP theorem and, 191–192
cost effectiveness of, 17 local application cache issues, 232
delivering static/dynamic content, quorum, 196–197
rise of eventual, 192–197
222–223 trading high availability for, 197–199
CloudSearch, Amazon, 329 consistent hashing, 236–237
CloudWatch, 116, 118 constraints, scalability, 4
CloudWatch Logs, 346 content delivery networks. See CDNs
clusters (content delivery networks)
continuous delivery, 336
Cassandra topology, 200, 203 continuous deployment pipeline, 336–340
reverse proxy in front of web service, continuous integration, 336
220–221
scaling by adding clones to web, 232
scaling distributed object caches, 233,
235–239
contract, coding to, 51–54 vertical scalability issues, 9–10
contract surface area, 298, 299 virtual private server upgrades, 7
cookies Crash-Only, 78–79
critical path, message queues, 271
establishing HTTP session, 88–89 cron-like consumer approach, 261
handling session state with load balancers, culture of alignment, engineers, 361–362
custom routing rules, consumers, 264–265
92–93 customers, single server set-up, 6
session data stored in, 89–90
copy and paste programming, 49, 50–51 D
CORBA (Common Object Request Broker
Architecture), 132 daemon-like consumer approach, 261
core concepts data
application architecture. See application
consistency in MySQL replication, 169
architecture redundancy, from denormalization, 316
data center infrastructure, 22–26 scalability issues of more, 3
defining scalability, 2–4 searching for. See searching for data
evolution stages. See evolution to global data center infrastructure
additional components, 25
audience data persistence layer, 25–26
organizational scalability, 4 front line, 22–24
scalability in startup environment, 2 overview of, 22
cost understanding, 26
benefits of ELB, 106 web application layer, 24
challenges of sharding, 176–184 web services layer, 24–25
of cookie-based session storage, 89–90 data centers
of hardware load balancers, 110 in content delivery network, 14–15
of hosting on own servers, 119 deployment of private, 119–121
influencing, 354–355 in horizontal scalability, 18
manual vs. automated testing, 333–334 in isolation of services, 14
monolithic web service design and, 126–127 load balancers as entry point to, 103
per user/transaction decreasing over time, Route 53 latency-based routing and, 101–102
scaling for global audience with multiple,
332–333
as project management lever, 349–350 19–21
saving with Amazon SQS, 285–286 data layer
of scaling by adding clones, 74
vertical scalability issues, 9–10 MySQL. See MySQL, scaling
vertical vs. horizontal scaling, 16–17 NoSQL. See NoSQL data stores, scaling
Couchbase, 318 overview of, 156
CouchDB, 79, 318 partitioning. See data partitioning (sharding)
country code, sharding by, 174–175 summary, 204
coupling data model, Cassandra, 201
avoiding producer/consumer, 283 data normalization, 189–190
class diagrams visualizing, 59 data partitioning (sharding)
definition of, 43 advantages of, 175–176
dependency injection reducing, 65–68 building own file storage/delivery, 96
direct worker queue interactions and, Cassandra automatic, 200–201
challenges of, 176–184
296–297 choosing sharding key, 171–175
in event-driven architecture, 299 implementing, 188–189
loose. See loose coupling overview of, 170–171
measuring with contract surface area, 298 putting it all together, 184–189
in request/response interaction, 296, 297–299 scaling by adding, 71, 75–77
single responsibility reducing, 61 scaling cache cluster using, 239
CPU (central processing unit)
function-centric web services and, 132
memory caches, 208
data partitioning (cont.) dependencies
scaling distributed object caches, 235–237 class diagrams visualizing, 59–60
storing session data in dedicated session promoting loose coupling and, 44
store, 91 reducing between teams, 359
wide columnar data stores using, 317 web service, 31, 33–34
data persistence layer, data center, 25–26 dependency injection, 65–71
data set size deployment
affecting cache hit ratio, 208–209 automating process of, 49, 335–340
master-master replication and, 164 front-end layer examples, 117–121
reducing in indexes for faster search, 306 design. See software design principles
replication challenges, 166–170 design patterns
splitting in data partitioning. See data in copy-paste programming, 50–51
drawing diagrams with, 60
partitioning (sharding) for publish/subscribe model, 264
data storage, 113–114 using Amazon SQS with, 286
data stores Diagnostics, Azure, 346
diagrams
advantages of sharding, 176 circular dependencies exposed via, 47
in application architecture, 34–35 class, 59–60
horizontal scalability/high availability for, 190 drawing, 54–57
logging to, 345 module, 60–61
mapping data with, 180–182 reducing coupling on higher levels of
NoSQL era and, 191
replication of. See replication, MySQL abstraction via, 45
rise of eventual consistency for, 192–197 use case, 57–58
scaling object caches vs. scaling, 239 direct worker queue interaction, and EDA,
scaling own file storage/delivery, 96 296–297
storing session data in dedicated, 90–91 direct worker queue model,
databases routing, 262
avoid treating message queues as, 282–283 disaster recovery plans, 79
front end unaware of, 29 Distributed Component Object Model
scaling by adding clones, 73 (DCOM), 132
Datadog monitoring service, 344 distributed locking
DCOM (Distributed Component Object Model), 132 implementing, 98–101
deadlocks, 142, 143 Terracotta allowing for, 91
decision-making, 351 web service challenges, 142–143
decoupling distributed object caches
API-first design for web services, 128 cache high up call stack, 240
clients/providers, 51–54 overview of, 232–234
definition of, 44 scaling, 235–236
in event-driven interactions, 297 distributed transactions, 178
message consumers, 259 DNS (Domain Name System) server
message queues promoting, 275–276 in CDNs, 15
in MySQL replication, 158 in client caches, 208
producer/consumer, 260, 261 as front-end component, 102–103
publisher/consumer in RabbitMQ, 288–289 in front line of data center infrastructure,
dedicated search engine, using, 328–330
DELETE method, HTTP, 135–136, 211 22–23
deletes geoDNS server, 19–23
avoiding message queue, 283 in isolation of services, 12, 14
distributed object cache and, 232 in round-robin DNS service, 18–19
as limitation in Cassandra, 203–204 in round-robin–based load balancing,
local application cache issues, 232
delivery, continuous, 336 103–104
denormalization, 316, 317, 325 in single server set-up, 5–6
in vertical scalability, 11
document IDs, inverted index structure, 327–328 EmailService interface, 60
documentation end-to-end tests, 335, 339
engineering department, scaling, 332, 349, 357–361
80/20 rule for, 352–353 equal distribution, index performance, 310–311
reducing complexity with message route, 281 ESB (enterprise service bus), 260
via diagrams. See diagrams event sourcing, EDA, 300–301
document-oriented data stores, NoSQL, 317–318 event-based interaction, EDA, 297–301
downloading files, 93–96 event-driven architecture. See EDA (event-driven
drafts, diagram, 55–57
draw diagrams, as software design principle, 54–61 architecture)
draw.io, drawing diagrams with, 60 events, definition of, 295
DRY (don't repeat yourself), software design eventual consistency, 193–197, 203
principle, 48–51 evolution to global audience
duplication
API-first design for web services with, content delivery network, 13–16
horizontal scalability, 16–19
127–129 isolation of services, 11–13
avoiding in software design, 48–51 overview of, 5
local application cache issues, 232 scalability for global audience, 19–21
durability, ACID transactions, 177 single-server configuration, 5–7
dynamic content, CDN, 222–223 vertical scalability, 7–11
dynamic languages, front end, 111 exactly-once delivery, message requeueing, 280
Dynamo data store exchange concept, RabbitMQ, 289
as key-value data store, 317 Expires HTTP header, 215, 216
pushing conflict resolution onto clients, 195 Extensible Markup Language-Remote Procedure
scaling with NoSQL, 190–193 Call (XML-RPC), 132
E F
eBay bidding application failover
local locks preventing scaling out in, Azure automatic, 239
98–99 MySQL maintenance timeline for, 162–163
scaling by adding clones, 72–74 MySQL master-master, 161–162
scaling with data partitioning, 75–77 MySQL not supporting automatic, 160
scaling with functional partitioning, 74–75 NoSQL with automatic, 198
removing for high availability, 79
EC2. See Amazon EC2 (Elastic Compute Cloud) using load balancers with automatic, 106
EDA (event-driven architecture)
failure handling
currently in renaissance phase, 301–302 with load balancers, 105
definition of, 32, 295 for messaging platform, 284
direct worker queue interaction, 296–297 with MySQL replication, 159–162
event-based interaction, 297–301 with stateless web service machines, 139
request/response interaction, 296
traditional interactions vs., 32–33 fearless engineers, 334
edge-cache servers, 20–21 feature toggles, build and deployment, 339–340
80/20 rule, 352–353, 356–357 features
ElastiCache, Amazon, 238–239
Elastic Compute Cloud. See Amazon EC2 80/20 rule for new, 352
(Elastic Compute Cloud) Amazon SQS, 288
Elasticsearch, 329–330, 347 RabbitMQ, 291
ELB. See Amazon ELB (Elastic Load Balancer) feedback
e-mail balancing schedule with, 356–357
asynchronous processing of, 250–254 continuous deployment pipeline, 339
class diagram, 59–60 Lean Startup methodology, 285
single responsibility for validation of, 62–63 making use of, 49
synchronous processing of, 247–250 ongoing code reviews, 355
releasing smaller chunks for customer, 356
FIFO (First In First Out) G
ActiveMQ for messaging, 292
message queue as, 283 generic server metrics, reducing MTTR, 341
solving message ordering with, 278 geoDNS server, 19, 21–23
GET method
file storage
choosing deployment, 121 caching service responses, 146–148
managing, 93–96 challenges of sharding, 176–177
as possible single point of failure, 79 HTTP and resource-centric services,
using Amazon S3 for, 119
135–138
file system, scaling local cache, 235 HTTP session management, 88, 90–91
file-based caches, 235 GFS (Google File System), 96, 190–191
fine-grained locks, 143 GitHub web hook, 338
fire-and-forget requests global audience, scalability for, 19–21
globally unique IDs, application-level sharding, 184
asynchronous example, 249–251 Google Maps API, 42–43
easier scalability and, 272 Google Trends, 292–293
message queue as, 256, 282 Grails framework, 42, 68
Firefox, 53–54 Grails in Action, 42
First In First Out. See FIFO (First In First Out) GridFS, in MongoDB, 96
flexibility group, auto-scaling, 115–116
Amazon SQS disadvantages, 288 group ID, ActiveMQ, 279, 294
of good architecture, 316
of RabbitMQ, 288–291 H
framework, front end, 111
front cache servers, 22–24 HA Proxy, 119–120
front-end layer Hadoop, 42
application architecture for, 28–30 Hadoop Distributed File System (HDFS), 96
building, 84–85 Hadoop in Action, 42
deployment examples, 117–121 HAProxy, 107–109
overview of, 84 hard drive caches, 208
summary, 121 hard drive speed, 9–10
front-end layer, scalability components hardware
auto-scaling, 114–116
caching, 113–114 isolation of services using rented, 13
DNS, 102–103 load balancers, 109–111
load balancers, 103–111 private data center hosting own, 119–121
overview of, 101–102 reverse proxy, 226–227
web servers, 111–113 upgrading for vertical scalability, 8–9
front-end layer, state HBase, 317
for files, 93–96 HDFS (Hadoop Distributed File System), 96
for HTTP sessions, 88–93 headers, HTTP, 211–217
other types of, 97–101 hexagonal architecture, 32
stateless vs. stateful services, 85–88 high availability
frontline layer, data center infrastructure, 22–24 building own file storage/delivery, 95
full page caching, 220–221 comparing in messaging platforms, 293
full table scans, 305 data stores for, 190–191
full text search, with inverted indexes, 326–328 definition of, 78
functional partitioning Elastic Load Balancer with, 106
with distributed locking, 98–100 eventual consistency increasing, 194
isolation of services using, 13–14 HAProxy with, 109
scaling with, 71, 74–75, 185–187 MySQL replication with, 159–160
function-centric web services, 131–135 software design for, 77–80
functions trading for consistency, 197–199
MySQL replication, 169 high cardinality fields, indexes, 309–310
sharding, 182 high coupling, 43, 46
high value processing, message queues, 271 full text search using inverted, 326–327
Hollywood principle, IOC as, 69 item distribution in, 310–311
horizontal scalability key-value stores not supporting, 317
as lookup data structure, 305
Cassandra for, 203 properties of, 305–306
comparing in messaging platforms, 294 searching for data and, 304–305
data partitioning and. See data partitioning using job queue for search engine data, 329
infrastructure, messaging, 266–270
(sharding) inheritance, for repetitive code, 50
data stores for, 190–191 innovation, scaling engineering department, 359–361
deferring building of, 353 integration, continuous, 336
evolution to, 16–19 integration tests, 335
RabbitMQ for, 291 interaction rates, scaling for higher, 3–4
scaling data layer for. See data layer interfaces
stateless front-end web servers for, 111–112 dependencies of, 60
wide columnar data stores using, 317 in open-closed principle, 63–64
HTML (Hypertext Markup Language), 124, 217 Internet Information Services (IIS), 53–54
HTTP (Hypertext Transfer Protocol) interoperability, JMS/STOMP, 266
coding to contract and, 53–54 inverted indexes, 326–330
edge-cache servers and, 20 I/O (input/output)
managing sessions, 88–93 blocking, 248
REST services using, 135–137 as indexing overhead, 308
in single server set-up, 5–6 in MySQL replication, 158–159
testing with JMeter, 335 nonblocking, 253
web applications initially built with, 124 vertical scalability improving, 8–9
HTTP-based caches IOC (inversion of control), software design
browser cache, 208, 218–219 principle, 68–71
caching headers, 211–217 IP (Internet Protocol) address, 5–6, 101–102
caching proxies, 219–220 isolation
caching rules of thumb, 239–244 in ACID transactions, 177
CDNs, 221–222 decoupling of producers/consumers,
between clients and web service, 213
object caches vs., 227 275–276
overview of, 210–211 evolution of services to, 11–13
reverse proxies, 220–221 message queues for failure, 274–275
scaling, 223–227 of queue workers, 268
SOAP scalability and, 134
types of, 217–218 J
HTTPS (HTTP over TLS Transport Layer
Security), REST services, 138 Java
hybrid applications, front-end layer, 85 ActiveMQ written in, 291–292
Hypertext Markup Language (HTML), 124, 217 distributed locking in, 99–100
overengineering in, 40
I using inversion of control, 68
idempotent consumer, 280 Java JVM, 91
IIS (Internet Information Services), 53–54 JavaScript, 228–229
incremental change, for inefficient processes, 49 Jenkins, 337–338
indexes JMeter, 335, 339
JMS (Java Message Service), 266, 292
adding overhead, 308–309 JMX (Java Management Extensions) protocol,
binary search algorithm in, 306–307
book, 305–306 ActiveMQ, 292
compound (composite), 311 job queue, search engines, 329
estimating field cardinality for, 308–310 JSON (JavaScript Object Notation)-based REST
full table scans vs., 305
services, 133, 136
K locks
managing server state, 98–99
Kafka topic partitioning, 279 preventing deadlocks, 142
keep-alive header, 212 resource. See resource locks
keys
logging
accessing value in web storage, 229 automating log aggregation, 345–347
client-side caches and, 228 custom routing rules for, 264
key-value stores log-forwarding agents, 346
client-side caches as, 228–229
distributed object caches as, 232 Loggly, hosted log-processing service, 346
NoSQL, 317 Logstash, 346–347
Kibana web interface, Logstash, 347 longevity, affecting cache hit ratio, 209
loose coupling
L
avoiding unnecessary coupling, 47
language models of, 47–48
function-centric web services and, 132 overview of, 43–44
selecting for front end, 111 promoting, 44–46
low coupling, 44, 46
latency low value processing, message queue, 271
Amazon Route 53 routing and, 102 LRU (Least Recently Used) algorithm, 224, 233
dictated by weakest link in call stack, 296
eventually consistent stores and, 197 M
hosting own hardware and, 119
shared lock management increasing, 99 maintenance
cloud service provider costs for, 17
LBaaS load balancer, OpenStack, 111 data stores for file storage reducing cost of, 96
Lean Startup methodology, 285, 356 higher costs for more code, 50
Least Recently Used (LRU) algorithm, 224, 233 load balancers for hidden server, 104
links, references, 374–377 master-master deployment for long-lasting,
Linux OS file caches, 208 161–163
Load Balancer, Azure, 111, 140 message queues and performing, 274
load balancers stateless web services for easy, 139
benefits of, 104–106 manual deployment, vs. automated, 335–336
benefits of stateless web service machines, manual testing, vs. automated, 333–334
mapping
139–140
deploying private data center with, keeping data in separate database, 179–182
modulo-based issues, 178
119–120 multidatabase sharded solution, 182
DNS round-robin–based, 103–104 scaling with data partitioning using, 76–77
as front line of data center infrastructure, sharding key to server number, 172–173
MapReduce, 42, 190–191
22–24 master server
as front-end layer component, 101 MySQL master-master replication, 161–164
handling session state with, 92–93 MySQL replication, 169–170
hardware-based, 109–111 MySQL ring replication, 164–165
as hosted service, 106–107 replicating sharding key mappings, 180–182
in MySQL replication with multiple master-master replication, MySQL
adding difficulty by using, 166
slaves, 158 challenges of, 166–170
self-managed software-based, 107–109 deploying, 160–163
load testing, JMeter, 335 not viable for scalability, 163–164
local cache master-slave topology, MySQL
caching in different layers of stack, 240 object caches allowing for replication, 237
implementing, 230–232 recovering from failure, 160–161
scaling web server, 235 replication, 157–159
local device storage, client-side cache, replication, scaling cache cluster, 237–238
228–229
local simplicity, in software design, 39–40
lock contention, 9–10
replication challenges, 166–170 example of, 250–254
as single source of truth semantics, 166 front-end sending events to, 29
max-age response, Cache-Control HTTP header, 214 message broker, 259–260
Memcached message consumers, 260–265
distributed locking, 100–101 message producers, 258–259
distributed object caches, 232–234 messaging infrastructure, 266–270
scaling distributed object caches, 235–236 messaging protocols, 265–266
memory overview of, 256–257
cache servers using LRU for limits of, 233 removing resource locking in web
implementing local application caches, 230
as indexing overhead, 308 services, 142
needs of search engines, 328 message requeueing problem, 280
message brokers message-oriented middleware (MOM), 259–260
in ActiveMQ, 291–292 messaging infrastructure, 266–270
creating custom routing rules, 264–265 messaging platforms
isolating failures, 274–275
in message queue-based processing, 259–260, ActiveMQ, 291–292
Amazon SQS, 285–288
273–274 final comparison notes on, 292–294
metrics, 267–270 overview of, 284–285
in RabbitMQ, 290 RabbitMQ, 288–291
scaling horizontally, 268 messaging protocols, 265–266, 288
in system infrastructure, 267 metatags, avoiding cache-related HTML, 217
message consumers metrics, reducing MTTR, 341–343
benefits of message queues, 272–273 Microsoft Azure. See Azure
custom routing rules for, 264–265 minimal viable product (MVP) development, Lean
decoupling producers from, 260, 274–275, 283 Startup, 285
delivering messages to, 256–257 mobile clients
direct worker queue interaction, 262, 297 developing mobile APIs, 124
event-driven interaction, 297, 299 scaling front end with browser cache, 113–114
idempotent, 280–281 single-page application for devices, 229
message ordering problem, 276–279 mocks, startup development, 357
messaging infrastructure for, 268–269 modeling data
overview of, 260–262 NoSQL, 313–318
publish/subscribe method, 263–264 overview of, 313
message groups, ActiveMQ, 279, 292 wide column storage example, 318–325
message of death, handling, 284 modules
message ordering problem avoiding unnecessary coupling in, 47
causing race conditions, 281 class diagrams of, 59–60
overview of, 276–278 drawing diagrams, 60–61
partial message ordering, 279 loose coupling for, 44–46
solving, 278–279 single responsibility for, 62
message producers modulo-based mapping, and sharding, 178
decoupling consumers from, 260, MOM (message-oriented middleware), 259–260
MongoDB
275–276, 283 as document-oriented data store, 318
in direct worker queue interactions, 297 fast recovery for availability, 197–199
in event-driven interactions, 297, 299 scaling file storage/delivery with, 96
isolating failures and, 274–275 monitoring
overview of, 258–259 automating, 340–345
message publishing, 258, 274–276 installing agent on each server, 342
message queues tools, 340
anti-patterns, 282–284 monolithic application with web service, 124–127,
benefits of, 270–276 130–131
caching high up call stack, 240 MTTR (mean time to recovery), reducing
challenges of, 276–282 in monitoring and alerting, 340–345
as data center infrastructure, 25 in self-healing, 80
multidatabase sharded solution, 181–182, 183 O
multilayer architecture, 31
multiple load balancers, 109–110
multiple reverse proxy servers, 225–226
must-revalidate response, Cache-Control HTTP header, 214
MVC frameworks, 65, 126
MVP (minimal viable product) development, Lean Startup, 285
MySQL, as most popular database, 156
MySQL, scaling
    overview of, 156
    replication, 156–166
    replication challenges, 166–170
    vertical scalability issues, 9–10
N
NASA (National Aeronautics and Space Administration), 50
Netscaler, Citrix, 109–111
networks
    HTTP proxy server in local, 219–220
    improving throughput for vertical scalability, 9
Nginx
    private data center deployment using, 119–120
    reverse proxy, 224
    as software-based load-balancer, 107–109
    superior performance of, 226
no-cache response, Cache-Control HTTP header, 214
Node.js, 112, 271
nodes
    in Cassandra topology, 80, 199–201
    in MongoDB failure handling, 198–199
    share-nothing principle for, 76
nonblocking I/O, 253
noncacheable content, HTTP headers of, 216–217
normalization
    NoSQL denormalization, 316
    in relational data model, 314–315
NoSQL data stores
    data as index in, 312–313
    in data layer of data center, 25
    data modeling, 313–317
    dedicated search engine for, 328–330
    defined, 190–191
    as most commonly used, 317–318
NoSQL data stores, scaling
    Cassandra topology, 199–204
    faster recovery for availability, 197–199
    overview of, 189–191
    rise of eventual consistency, 192–197
no-store response, Cache-Control HTTP header, 214
no-transform response, Cache-Control HTTP header, 214
OASIS (Organization for the Advancement of Structured Information Standards), AMQP, 265
object cache servers, 25, 114
object caches
    caching application objects, 227–228
    caching rules of thumb, 239–244
    client-side, 228–230
    co-located with code, 230–232
    distributed object, 232–234
    scaling, 234–239
    size affecting cache hit ratio, 209
object-clustering, Java JVM session storage, 91
object-oriented languages, coupling in, 44–45, 47
open-closed principle, 63–68
operating system
    metrics reducing MTTR, 341
    as multilayered architecture, 31
operations, scalability of, 332
optimistic concurrency control, 142
OR conditions, full text search, 328
Organization for the Advancement of Structured Information Standards (OASIS), AMQP, 265
organizational scalability, constraints, 4
overengineering
    API-first for web services risking, 128–129
    avoiding for simplicity, 40–41
    designing for scale without, 71
overhead, added by indexes, 308–309
overtime, and productivity, 347–348
P
pair programming, 354
parallel back-end processing, 272–273
partial message ordering guarantee, 279
partition tolerance, CAP theorem, 191–192
partitioning. See data partitioning (sharding)
partitioning, topic, 279
pattern matching, customizing routing rules for, 264
performance
    asynchronous processing and, 253–254
    caching to improve, 242–243
    increasing. See caching
    synchronous processing and, 249–250
persistent data, and stateless web services, 140–141
pipes, Unix command-line program, 47–48
plugins, inversion of control principles for, 70
poison message handling, 284
policies, scaling, 115
POST method, HTTP, 135–136, 211
pragmatic approach, web service design, 130–131
presentation layer, and web services, 124–127, 130–131
primary node failure, MongoDB, 198–199
prioritizing
    tasks to manage scope, 351–354
    where to start caching, 242–243
private files, 93, 95
private response, Cache-Control HTTP header, 213
procedures, scaling engineering department, 359–361
processes, wasting time on inefficient, 49
productivity, scaling. See automation; yourself, scaling
products, building teams around, 358
protocols, messaging, 265–266
providers
    auto-scaling using hosting, 115
    coding to contract to decouple clients from, 51–54
    configuring/scalability of CDN, 221–223
proxy (intermediate), HTTP-based caches, 210
proxy servers, 223
pt-table-checksum, MySQL replication issues, 170
pt-table-sync, MySQL replication issues, 170
public files, 93–94
public response, Cache-Control HTTP header, 214
publishing, message, 258, 274–276
publish/subscribe queue model, routing, 263–264
PUT method, HTTP, 135–136, 211
Q
queries
    in Cassandra, 202
    designing NoSQL data model, 314–316
    executing across shards, 176–177
    optimizing for kinds of, 325
    wide column storage example of, 318–321
queue workers
    allowing self-healing of system, 274–275
    isolating, 268
    scalability by adding parallel, 272–273
    in system infrastructure, 267–268
queue-based interaction, and EDA, 296–297
queues, in Cassandra, 204
quorum consistency, 196–197, 203
R
RabbitMQ
    comparing messaging platforms, 286
    flexible routing rules in, 264
    message ordering problem, 280
    messaging infrastructure, 269
    messaging protocols for, 265–266
    overview of, 288–291
    poison message handling in, 284
race conditions, 98–99, 281
Rackspace
    auto-scaling with, 115
    hosting MySQL with Cloud Database, 170
RAID (Redundant Array of Independent Disks), 8, 95–96
Rails, 68
RAM (random access memory), 7–10
random access, in message queue, 283
random access I/O, 8
random access memory (RAM), 7–10
random order, solving message ordering, 278
rapid learning, Lean Startup, 356
RDS (Relational Database Service), Amazon, 170
read-only statements, MySQL replication, 158
reads
    adding replica servers, 185–186
    eventual consistency conflicts, 194–196
    MySQL replication, 166, 186–188
    MySQL replication timing issues, 169
    trading high availability for consistency, 197–199
read-through caches
    cache-aside caches vs., 227
    caching proxies as, 219–220
    HTTP-based caches as, 210–212
Redis, distributed object caches, 232–234
redundancy, for high availability, 79–80
Redundant Array of Independent Disks (RAID), 8, 95–96
refactoring
    80/20 rule for, 353
    copy-paste programming and, 50
    single responsibility principle for, 61–63
references, for this book
    books, 364–366
    links, 374–377
    talks, 373–374
    white papers, 366–373
regression testing, 333
reinventing the wheel, avoiding wasted time, 49
relational data model, 313–315
Relational Database Service (RDS), Amazon, 170
relay log, MySQL replication, 158, 161
release cycle
    reducing size of each, 356–357
    wasting time on inefficient processes in, 49
remote servers, message queue interaction with, 271
replica servers, adding, 185–186
replica sets, MongoDB, 198–199
replication
    Cassandra, 201–202
    local caches not using, 231–232
    scaling object caches with, 237–238
replication, MySQL
    applicable to other data stores, 170
    challenges of, 166–170
    defined, 156–157
    handling slave failures, 160
    master-master, 160–164
    master-slave, 157–158
    overview of, 156
    ring, 164–165
    scaling with, 186–187
    summary of, 166
    using multiple slaves, 158–159
replication lag, 165–166, 169
request headers, HTTP-based caches, 212–213
request/response interaction, and EDA, 296
requirements, scalability, 3–4
resource intensive work, with message queues, 271
resource locality, CDNs, 15
resource locks, 98–99, 141–143
resource management, 105–106
resource-centric web services, 134–138
response headers, HTTP-based caches, 212–214
REST (Representational State Transfer) web services
    between front-end and web services, 25
    JSON-based, 133
    as resource-centric, 135–138
REST API, RabbitMQ, 289–290
REST web services, scaling
    caching service responses, 146–149
    cluster of, 220–221
    functional partitioning, 150–153
    keeping service machines stateless, 139–146
    overview of, 138
return channels, avoiding message queue, 282–283
reuse of cached objects, 240–242
reuse of code
    avoid reinventing the wheel, 49
    open-closed principle for, 64–65
    single responsibility principle for, 61–63
reuse of tools, 355
revalidation, Cache-Control HTTP header, 214
reverse proxies
    caching high up call stack, 240
    as front-end layer component, 101
    as HTTP-based cache, 220–221
    managing scalability of, 223–227
    scaling front end with caching, 113
    as software-based load-balancers, 107–109
Riak, 317
ring replication, MySQL, 164–170
round-robin DNS service, 18, 19, 103–104
Route 53 service, Amazon, 101–103, 117–119
routing
    ActiveMQ and, 291, 292
    Amazon SQS and, 288
    methods, 262–265
    RabbitMQ advanced message, 288–290
rows, Cassandra table, 319
rules
    creating indexes, 310
    custom routing, 264–265
rules, caching
    cache invalidation is difficult, 243–244
    caching high up call stack, 239–240
    reusing cache among users, 240–242
    where to start caching, 242–243
run-time environments, function-centric web services, 132
S
S3 (Simple Storage Service), 93–95, 117–119
scalability
    ActiveMQ, 291–292
    agile teams, 357–361
    Amazon SQS, 286–288
    automation. See automation
    concept of, 2–4
    definition of, 3
    engineering department, 349
    local application vs. distributed object caches, 232–234
    message queues for easier, 272–273
    of object caches, 234–237
    operations, 332
    RabbitMQ, 291
    as software design principle, 71–77
    startup environment and, 2
    your own impact, 349
    for yourself. See yourself, scaling
schedule, influencing, 355–357
schema, NoSQL data model, 314
scope
    influencing by prioritizing tasks, 350–354
    as project management lever, 349–350
Search, Azure, 329
search engines
    introduction to, 326–328
    memory needs of, 328
    overview of, 326
    using dedicated, 328–330
searching for data
    introduction to indexing. See indexes
    modeling data, 313–325
    overview of, 304
    search engines, 326–330
    summary, 330
Secure Sockets Layer (SSL)
    overview of, 220
    termination. See SSL termination
security
    REST services vs. SOAP, 138
    stateless web services for, 140–142
Selenium, end-to-end tests, 335, 339
self-healing
    auto-scaling similar to, 116
    in Cassandra, 202–203
    designing software for, 77–80
    message queues promoting, 274–275
self-managed software-based load-balancers, 107–109
separation of concerns, API-first design for web services, 128
sequential I/O, 8
server number
    mapping data in separate database, 179–182
    mapping sharding key to, 179
servers
    adding to sharded deployments, 178
    automating configuration of, 338
    horizontal scalability with multiple, 16–19
    hosting on hardware vs. virtual, 119
    isolating roles in functional partitioning, 74–75
    isolating services to separate, 11–14
    managing state, 97–101
    reducing with CDNs, 15
    stateless vs. stateful, 85–88
    uploading user-generated content to, 93–96
Service Bus Queues, scalability limits of, 269
service calls, in Amazon SQS, 288
service level agreement (SLA), 343
service-oriented architecture. See SOA (service-oriented architecture)
services
    adding abstraction for, 40
    building teams around, 358
    isolating to separate servers, 11–14
    in request/response interactions, 299
    scaling with functional partitioning, 74–75
    in web services layer, 30–34
sessions, managing HTTP, 88–93
setters, unnecessary coupling and, 47
sharding. See data partitioning (sharding)
sharding key
    choosing, 171–174
    definition of, 171
    implementing, 188–189
    mapping to server number, 178
shared hosting, 6, 7
shared libraries, 50
shared memory, 230
shared object cache, 141–142
share-nothing principle
    advantages of sharding, 175–176
    scaling distributed object caches, 236
    scaling with data partitioning using, 76–77
Simple Logging Facade for Java (SLF4J), loose coupling in, 48
Simple Mail Transfer Protocol (SMTP), 59–60
Simple Object Access Protocol. See SOAP (Simple Object Access Protocol)
Simple Queue Service. See Amazon SQS (Simple Queue Service)
Simple Storage Service (S3), 93–95, 117–119
simplicity
    hiding complexity/building abstractions, 38–40
    learning from software design, 42–43
    overengineering reducing, 40–41
    single responsibility increasing, 61–63
    as software design principle, 38
    with test-driven development, 41–42
single points of failure, 79, 106
single responsibility, 61–63, 68
single-page applications. See SPAs (single-page applications)
single-server configuration
    adding vertical scalability to, 7–11
    evolution from, 5–7
    isolation of services for, 11–14
    scalability limitations of, 7
    scaling by adding copies of same thing, 184–185
size
    of cache affecting cache hit ratio, 209
    data set. See data set size
    scalability of reverse proxy, 224
SLA (service level agreement), 343
slave servers, MySQL replication. See also master-slave topology, MySQL
    breaking data consistency, 169–170
    multiple, 158–159
    overview of, 157–158
    rebuilding of failed, 160
    returning stale data, 169
    scaling reads, 166
SLF4J (Simple Logging Facade for Java), loose coupling in, 48
SMTP (Simple Mail Transfer Protocol), 59–60
SOA (service-oriented architecture)
    definition of, 30
    RabbitMQ message routing as, 290
    scaling with functional partitioning in, 74–75
SOAP (Simple Object Access Protocol)
    as function-centric, 132
    integration flow for, 132–133
SOAP (cont.)
    interoperability/usability of, 133–134
    over HTTP, 25
    REST vs., 137–138
    scalability issues of, 134
    SOA vs., 30
software design principles
    coding to contract, 51–54
    dependency injection, 65–68
    design for scale, 71–77
    design for self-healing, 77–80
    don’t repeat yourself, 48–51
    draw diagrams, 54–61
    inversion of control, 68–71
    loose coupling, 43–48
    open-closed principle, 63–65
    overview of, 38
    simplicity, 38–43
    single responsibility, 61–63
    summary, 81
software-based load-balancers, self-managed, 107–109
solid-state drives. See SSDs (solid-state drives)
Solr, as dedicated search engine, 329
sorting algorithm, open-closed principle, 63–64
SPAs (single-page applications)
    building front-end layer as, 84
    building front-end layer as hybrid application, 85
    local device storage for, 229
    scaling front end with browser cache, 113–114
Sphinx, as dedicated search engine, 329
spikes
    ActiveMQ message, 292
    message queues evening out traffic, 273–274
Spring framework, 68
Spring Recipes, 42
SpringSource, 292
SQL Database Elastic Scale, Azure, 184
SQS. See Amazon SQS (Simple Queue Service)
Squid, as open-source reverse proxy, 224
SSDs (solid-state drives)
    building own file storage/delivery, 96
    improving access I/O for vertical scalability, 8
    scaling reverse proxies vertically, 227
SSL (Secure Sockets Layer), 220
SSL termination
    benefits of Elastic Load Balancer, 106
    defined, 106
    HAProxy load balancer supporting, 109
stale, cached objects as, 214
startups, high failure rate of, 71
state, managing
    files, 93–96
    HTTP sessions, 88–93
    keeping web service machines stateless, 139–146
    other types of state, 97–101
    stateless vs. stateful services and, 85–88
stateful services, stateless vs., 85–88
stateless services
    defined, 73
    queue workers as, 268
    scaling by adding clones to, 73
    stateful vs., 85–88
    web servers as, 268
    web service machines as, 139–146, 268
static files, 215–216, 222–223
sticky sessions, 92–93, 109
STOMP (Streaming Text-Oriented Messaging Protocol), 265–266, 288
streaming logs, to centralized log service, 345
Streaming Text-Oriented Messaging Protocol (STOMP), 265–266, 288
subscription methods, message consumers, 262
subsets, in sharding, 171, 175
Symfony, and inversion of control, 68
synchronization
    consistent data stores supporting, 196
    local application caches not using, 231–232
    replication in MySQL as, 157
synchronous invocation, as temporal coupling, 296
synchronous processing
    affecting perceived performance, 249–250
    asynchronous processing vs., 246
    example of, 247–249
    shopping analogy for, 254–255
T
tables, Cassandra, 201, 319–323
talks, references for, 373–374
tasks
    80/20 rule for, 353
    delegating responsibilities and, 354
    prioritizing to manage scope, 351–354
TCP socket, avoid treating message queue as, 282–283
TCP/IP programming stack, 31
TDD (test-driven development), 40–41
technologies, application architecture supporting, 34–35
templates, 29
temporal coupling, 296–297
Terracotta, JVM session storage, 91
test-driven development (TDD), 40–41
testing, automated, 333–335
third-party services
    application architecture, 35
    content delivery network, 13–16
    data centers, 24
    Datadog monitoring service, 344
    deploying private data center, 119–120
    distributed file storage, 96
    front end unaware of, 29
    horizontal scaling, 17–19
    hosting DNS, 101
    integration between web applications, 25
    reducing workload/increasing costs, 355
    scaling HTTP caches, 223–227
    service level agreement monitoring, 343
    sharing application logs with, 346–347
time
    avoiding overtime, 347–349
    influencing schedule, 355–357
    MySQL replication issues, 169
    as project management lever, 349–350
time to discover, MTTR, 340
time to investigate, MTTR, 341
Time to Live. See TTL (Time to Live) expiration
time to respond, MTTR, 340–341
tokens, inverted index structure, 327–328
Tomcat, 53–54
traditional multipage web applications, 84, 85
traffic
    benefits of stateless web service machines, 139
    in CDNs, 15
    distribution in single server, 5–6
    message queues evening out spikes in, 273–274
    scaling for global audience, 19–21
    single-server scalability limits, 7
TTL (Time to Live) expiration
    defined, 209
    keeping web service machines stateless, 140
    max-age response, Cache-Control HTTP header, 214
    overcoming cache invalidation by setting short, 243–244
    scalability of reverse proxy, 224
U
Ultima Online, 171
UML (Unified Modeling Language), 60
Unified Modeling Language (UML), 60
uniform resource locators. See URLs (uniform resource locators)
unit tests, 334, 338–339
Unix command-line program, loose coupling, 47–48
updates
    avoiding message queue, 283
    breaking data consistency in MySQL replication, 169
    denormalized data model issues, 316
    stateless web service machines and, 139
    validating data model use cases, 324–325
upgrades, 7, 8
uploading, user-generated content to your server, 93–96
URLs (uniform resource locators)
    bundling CSS and JS files under unique, 216
    distributed file storage using S3, 94
    downloading files using, 93
    REST services using, 135–137
use cases
    API-first design for web services, 127–129
    drawing diagrams, 57–58
    file storage, 93
    message queues, 271
    preventing repetition with most common, 51
    stateless web services, 141–146
    validating data model against known, 324–325
    web server, 111–112
    wide column storage example, 318–322
user ID. See account ID (user ID)
user interface, front end application as, 29
user-generated content, uploading to servers, 93
users
    reuse of cached object, 240–242
    sharding based on, 172–174
    validating data model use cases, 324
V
value
    accessing in web storage, 229
    client-side caches and, 228
Varnish, 53–54, 224
Vary HTTP header, 215
vertical scalability
    cost issues of, 9–10
    definition of, 8
    methods, 8–9
    reverse proxies, 227
    system architecture unaffected by, 10–11
virtual servers, 8, 119
vision, validating for success, 351
VPS (virtual private server), 6, 7, 13
W
web application layer
    data center infrastructure, 24
    managing HTTP sessions, 91–92
    master-master replication and, 163
    sharding in, 174, 176–178, 188
web application servers, 240
web applications, building front-end layer, 84–85
web browsers, decoupling from providers, 53–54
web flows, separate from business logic, 29
web servers
    benefits of load balancers, 104–106
    decoupling from clients, 53–54
    for front-end scalability, 111–113
    as front-end layer component, 101
    for HTTP session with cookies, 88–89
    keeping stateless, 101, 139
    local cache on, 230–232
    reverse proxy reducing load on, 220–221
    scaling by adding clones, 72–74
    scaling local caches, 235
web services
    application architecture for, 30–34
    in data center infrastructure, 24–25
    designing, overview, 124
    designing API-first, 127–129
    designing as alternative presentation layer, 124–127
    designing with pragmatic approach, 130–131
    function-centric, 131–134
    overview of, 124
    resource-centric, 134–138
    REST. See REST (Representational State Transfer) web services
    scaling REST. See REST web services, scaling
    scaling with functional partitioning, 74–75
    summary, 153
web session scope, HTTP sessions, 89
web storage
    with JavaScript code, 228–229
    scaling client-side caches for, 233–234
    speeding up application time, 229–230
    using as cache vs. reliable data store, 229
white paper references, 366–373
wide columnar data stores, 317, 318–325
writes
    Cassandra optimization for, 202, 322–323
    cost of deletes in Cassandra, 203–204
    eventual consistency conflicts, 194–196
    master-master replication and, 163–164
    not scaling using replication, 166
    scaling with, 186–187
    trading high availability for consistency, 197–199
ws-* specifications, SOAP, 133–134
X
XML-RPC (Extensible Markup Language-Remote Procedure Call), 132
Y
yourself, scaling
    influencing cost, 354–355
    influencing schedule, 355–357
    influencing scope, 350–354
    overtime and, 347–349
    overview of, 347
    self-management, 349–350
    your own impact, 349
Z
zero-downtime
    backups, 159
    updates, 139
Zookeeper, 99–100, 142