# Overall Principles

## Grasping Key Tech Fundamentals

- Describing distributed systems
- Core networking fundamentals
- Applying HTTP/HTTPS
- Understanding SRE principles

## Keeping in Compliance

- Follow the spirit and letter of "the law"
- Compliance with what?
- Getting help with compliance
- Relevant products and services

## Annotating Resources Properly

- Understanding annotation options
- Applying security marks
- Working with labels
- Implementing network tags
- Choosing the right annotation

## Managing Quotas & Costs

- Working with quota limits
- Cost optimization principles
- Best practices (overall, compute, storage and data analysis)

---

## Key Fundamentals

Distributed System

- Group of servers working together so as to appear as a single server to the end user
- Scale Horizontally - increase capacity by adding more servers that work together
- Scale Vertically - increase capacity by adding more memory or using a faster CPU
- Sharding - splitting a dataset across multiple servers, a.k.a. "partitioning"

Networking

- Be familiar with the 7-layer OSI model
- 7-layer OSI model
  - Application - end-user layer (human-computer interaction): HTTP, FTP, IRC, SSH, DNS
  - Presentation - syntax layer: SSL, SSH, IMAP, FTP, MPEG, JPEG
  - Session - sync and send to port: APIs, sockets, WinSock
  - Transport - end-to-end connections: TCP, UDP
  - Network - packets: IP, ICMP, IPSec, IGMP
  - Data Link - frames: Ethernet, PPP, switch, bridge
  - Physical - coax, fiber, wireless, hubs, repeaters
- TCP/IP - primary way data gets around the Internet
  - Handshaking with SYN/ACK
  - Addressing with IPv4 and IPv6
  - Public Internet and private RFC 1918 addressing
- SSL/TLS - encrypted comms
- SSH - secure remote access (e.g. to reach VMs and their disks)
- Ports
  - 80 - HTTP
  - 22 - SSH
  - 53 - DNS
  - 443 - HTTPS
  - 25 - SMTP
  - 3306 - MySQL

Applying HTTP/HTTPS

- Works on L7 (Application Layer)
- Understand your resources (URL/URI) and how parameters are applied
- Know the verbs: GET, POST, PUT, DELETE, PATCH, OPTIONS, TRACE, CONNECT
- Have a firm grasp of caching: headers and locations (browsers, proxies, CDN, memory cache)
- Be familiar with CORS
- HTTP/HTTPS status codes
  - 1xx - Informational
    - 100 - Continue
    - 101 - Switching Protocols
  - 2xx - Successful response
    - 200 - OK
    - 201 - Created
    - 202 - Accepted
    - 204 - No Content
    - 206 - Partial Content
  - 3xx - Redirection
    - 301 - Moved Permanently
    - 304 - Not Modified (caching)
    - 307 - Temporary Redirect
    - 308 - Permanent Redirect
  - 4xx - Client errors
    - 400 - Bad Request
    - 401 - Unauthorized
    - 403 - Forbidden
    - 408 - Request Timeout
    - 429 - Too Many Requests
  - 5xx - Server errors
    - 500 - Internal Server Error
    - 501 - Not Implemented
    - 502 - Bad Gateway
    - 503 - Service Unavailable / quota exceeded
    - 504 - Gateway Timeout
    - 511 - Network Authentication Required

Understanding SRE Principles

- What happens when a software engineer is tasked with what used to be called operations (Ben Treynor, ~2003)
- SLI - Service Level Indicator (carefully defined quantitative measure of the level of service provided over time)
  - Request latency - how long it takes to return a response to a request
  - Failure rate - failed requests as a fraction of all requests received
  - Batch throughput - proportion of time that the data processing rate exceeds a set threshold
- SLO - Service Level Objective (specifies a target level for the reliability of the service)
  - 100% is unrealistic, more expensive, and often not necessary for users
  - Best to find the level where users don't notice the difference, so more resources can focus on the value-add of the service
- SLA - Service Level Agreement (contractual obligation)
  - Includes consequences of meeting or missing the SLOs it contains
- SLI drives SLO, which informs SLA

---

## Compliance

Compliance with what

- Legislation - targeted areas (health regs, privacy, children's privacy, ownership)
- Commercial - protect sensitive data, credit cards / PII
- Industry certifications - ensure following health, safety, and environmental regulations
- Audits - create the necessary structure to allow for 3rd-party audits

Getting help with compliance

- Visit the Compliance Center - sortable by region, industry, and focus area
- General Data Protection Regulation (GDPR) - continues to have a major impact on web services around the world
- BAA - Google Business Associate Agreement (customer must request a BAA from their account manager for HIPAA compliance)

Relevant products and services

- 2-factor authentication
- Cloud Security Command Center (CSCC)
- Cloud IAM (global across all Google Cloud)
- Cloud Logging
- Cloud DLP (de-identification routines to protect PII)
- Cloud Monitoring (surface compliance missteps / alerts in real time)

---

## Annotations

Understanding annotations

- Security Marks - assigned and utilized through Cloud Security Command Center (CSCC)
- Labels - key-value pairs that help you organize cloud resources
- Network tags - applied to VM instances and used for routing traffic to/from them

Applying security marks

- Adds business context to assets for compliance
- Enhanced security-focused insights into resources
- Unique to CSCC
- Set at org, project, or individually
- Works with
labels and network tags

Working with labels

- Key-value pairs supported by a wide range of GCP resources
- Used for many scenarios
  - Identify individual teams or cost center resources
  - Distinguish deployment environments
  - Cost allocation and billing breakdowns
  - Monitor resource groups using metadata
- Labels can be applied to projects, but NOT folders

Implementing network tags

- Control traffic to/from VM instances
- Identify VM instances subject to firewall rules and network routes
- Use tags as source and destination values in firewall rules
- Identify instances on a certain route
- Configured with gcloud, console, or API

Choosing the right annotation

- Need to group/classify for compliance?
  - Yes: use Security Marks
  - No: Need billing breakdown?
    - Yes: use Labels
    - No: Need to manage network traffic to/from VMs?
      - Yes: use Network Tags

---

## Managing Quotas & Costs

Working within quota limits

- Quotas restrict how much of a shared GCP resource you can use
- Not to be confused with fixed constraints, which cannot be increased or decreased (e.g.
max file size, database schema limits)
- Two types of quotas:
  - Rate quotas - limit the number of API or service requests
  - Allocation quotas - restrict the resources available at any one time
- Limits are specific to your org
- Add your own quota limits to cap spending
- Exceeded quotas can generate a quota error and a 503 status for HTTP requests

Cost optimization principles

- Understand the total cost of ownership (TCO)
  - Commonly misunderstood when moving from an on-prem (CapEx) model to a cloud-based (OpEx) model
- Organize costs in relation to business needs
- Maximize the value of all expenses while eliminating waste
- Implement standardized processes at the start

Best practices: use cost management tools

- Organize and Structure - set up folders, projects, and use labels to structure costs in relation to business needs
- Billing Reports - view costs, analyze trends, and filter as needed
- Custom dashboards - can also export to BigQuery, then visualize in Cloud Data Studio
- Compute - pay for the compute you need
  - Identify idle VMs - use the Idle VM recommender service to identify inactive VMs
    - Snapshot them before deleting
    - Stop without deleting
  - Start/stop VMs automatically or via Cloud Functions
  - Create custom VMs with right-sized CPUs and memory
  - Make the most of preemptible/spot VMs (often an option - consider it for the exam)
- Cloud Storage - ways to keep more of your company's hard-earned money
  - Choose the right storage class by minimum storage duration: Nearline (30 days), Coldline (90 days), Archive (365 days)
  - Modify storage class as needed with lifecycle policies
  - Deduplicate data wherever possible (e.g. with Cloud Dataflow)
  - Choose single-region rather than multi-region buckets where viable
  - Set object versioning policies to keep copies down (e.g.
delete oldest after 2 versions)
- Keep BigQuery from BigCosts
  - Limit query costs with the maximum bytes billed setting
  - Partition tables based on ingestion time, date, timestamp, or integer range column
  - Switch from on-demand to flat-rate pricing to process unlimited bytes for a fixed, predictable cost
  - Combine Flex Slots (like preemptible) with annual and monthly commitments (blended)
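
The BigQuery points above come down to simple arithmetic, sketched here in Python. This is an illustration only: the function names are hypothetical and the prices are placeholders, not current list prices. It models a bytes-billed cap as a pre-flight check that rejects a query whose estimate exceeds the cap, and computes the monthly scan volume at which a fixed flat-rate fee would match on-demand spend.

```python
TIB = 2 ** 40  # bytes in one tebibyte

# Placeholder prices for illustration only (assumptions, not real list prices).
PRICE_PER_TIB = 5.00          # hypothetical on-demand price per TiB scanned
FLAT_RATE_MONTHLY = 2000.00   # hypothetical fixed monthly commitment


def on_demand_cost(bytes_scanned: int, price_per_tib: float = PRICE_PER_TIB) -> float:
    """Estimated on-demand cost of a query that scans `bytes_scanned` bytes."""
    return bytes_scanned / TIB * price_per_tib


def allowed_by_cap(estimated_bytes: int, max_bytes_billed: int) -> bool:
    """Return True if the estimated scan stays within the bytes-billed cap."""
    return estimated_bytes <= max_bytes_billed


def break_even_tib(flat_rate_monthly: float = FLAT_RATE_MONTHLY,
                   price_per_tib: float = PRICE_PER_TIB) -> float:
    """Monthly TiB scanned at which flat-rate and on-demand cost the same."""
    return flat_rate_monthly / price_per_tib


if __name__ == "__main__":
    estimate = 3 * TIB   # hypothetical dry-run estimate for one query
    cap = 1 * TIB        # hypothetical maximum bytes billed setting
    print("query allowed:", allowed_by_cap(estimate, cap))
    print("uncapped on-demand cost:", on_demand_cost(estimate))
    print("break-even TiB per month:", break_even_tib())
```

Under these placeholder numbers, a team scanning less than the break-even volume each month would pay less on demand, while heavier workloads would favor a fixed commitment.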