- Analysing performance: the steps involved and where to stop.
- Linux OS performance metrics. Many monitoring tools are built on top of these metrics.

Example: *System is slow*

1. Start with the `top` command for processes and CPU.
2. Check disk I/O (`iostat`) and network (`sar`).

What to do in such a case: quantify the problem (is it latency, etc.). Check system resources with a methodology and run through the checklist.

## Methodologies

- Provide guidance in choosing the performance tools: a starting point, a process and an ending point.

1. Anti-pattern: people tend to run the commands they know instead of understanding what the problem is and working on solving it. Drunk Man Anti-Method: randomly throwing everything at the problem.
2. Assuming it is maybe the network, the firewall, etc.

### Actual Methodologies

- Problem Statement Method
  - Why do you think there is a performance problem? Is this a new problem or has it been there for some time? Did something change recently? Can it be expressed in terms of latency or run time?
- Workload Characterization Method
  - Who is causing the load? Why is the load called? What is the load? How has the load changed over time?
  - Solves some issues.
- USE Method
  - USE: Utilization, Saturation, Errors
  - Draw a functional diagram of the system (listing all components of the system), and for every resource check **utilization** (busy time), **saturation** (queue length/time) and **errors** (easy to interpret).
  - Current tools might not look everywhere, so this method poses the questions before the answers and looks at places that are sometimes missed.
- CPU Analysis
  - Processes get deadlocked/blocked at some point (paging, context switching, network I/O).
- CPU Profile Method
  - Flame graph

#### Tools Categorised:

| Category | Purpose |
| --- | --- |
| Observability Tools | Watch activity |
| Benchmarking | Load test |
| Tuning | Changing system parameters |
| Static | Changing system configs |

Observability Tools: LinuxInternal.md

Hint: Get a functional diagram of the environment; it makes it easier to create a checklist.
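
The utilization part of the USE method (busy time of a resource) can be made concrete for the CPU. The sketch below is only an illustration, not something taken from these notes: it assumes two samples of the aggregate `cpu` line from `/proc/stat`, and the function names `cpu_times` and `cpu_utilization` are hypothetical. Idle and iowait are counted as non-busy time, which is one common convention.

```python
def cpu_times(stat_line):
    """Parse the aggregate 'cpu' line of /proc/stat into (busy, total) ticks."""
    fields = stat_line.split()
    if fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in fields[1:]]
    # Field order: user nice system idle iowait irq softirq steal guest guest_nice
    idle = values[3] + (values[4] if len(values) > 4 else 0)
    # guest and guest_nice are already included in user and nice, so skip them
    total = sum(values[:8])
    return total - idle, total


def cpu_utilization(before, after):
    """Return the busy percentage between two samples of the 'cpu' line."""
    busy0, total0 = cpu_times(before)
    busy1, total1 = cpu_times(after)
    elapsed = total1 - total0
    if elapsed <= 0:
        # No time passed between samples (or counters went backwards)
        return 0.0
    return 100.0 * (busy1 - busy0) / elapsed
```

To use it on a live system, read the first line of `/proc/stat`, wait a second, read it again, and pass both strings to `cpu_utilization`. Saturation and errors need other sources (for example the run queue from `vmstat`), so this covers only the "U" of USE.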

----

Example 2: Application latency is higher.

USE METHOD:

1. **top** command - Check the CPU summary, process/kernel time and CPU utilization (whether it is at 100 percent or not).
2. Check CPU utilization again with **vmstat** to see patterns. Check memory: whether there is enough left and it is not leaning towards the saturation point.
3. **mpstat** to check whether any single CPU is maxed out.

   ```
   Utilization and saturation metrics: not too much swapping, enough memory left, CPUs are not overloaded, CPU time for kernel/application is not too high, r (run queue in vmstat) is not a lot more than the number of CPUs present.
   CPU saturation/utilization is flexible on Linux: the kernel manages/moves things around, interrupts threads etc. if needed. The same is not the case with I/O.
   ```

4. Check disk I/O utilization with **iostat**. `%util` column: more than 60 percent utilization might be the problem.
5. Check network I/O utilization with **sar -n DEV 1**.
6. **pidstat** for per-process usage.

-----

Category of problem: sluggish, slow server

Questions to consider: what is the load, and when is it high?
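
The USE checks from Example 2 can be written down as a small checklist function. This is a minimal sketch under stated assumptions: the `metrics` dictionary layout, the function name `use_findings` and the `run_queue_factor` of 2 are all hypothetical choices made for illustration. Only the 60 percent disk figure and the "r not a lot more than the number of CPUs" rule come from the notes above, and the numbers would be filled in by hand from `top`, `vmstat`, `mpstat` and `iostat` output.

```python
def use_findings(metrics, cpu_count, disk_util_limit=60.0, run_queue_factor=2.0):
    """Return a list of human-readable findings for the USE checklist."""
    findings = []

    # Utilization: is the CPU pegged at 100 percent?
    if metrics["cpu_util_pct"] >= 100.0:
        findings.append("CPU utilization is at 100%")

    # Saturation: run queue (vmstat 'r') much larger than the CPU count.
    # The factor of 2 is an assumed reading of "a lot more".
    if metrics["run_queue"] > run_queue_factor * cpu_count:
        findings.append(
            f"run queue {metrics['run_queue']} is well above {cpu_count} CPUs"
        )

    # Memory saturation: any swap activity (vmstat 'si'/'so') is flagged here.
    if metrics["swap_in"] > 0 or metrics["swap_out"] > 0:
        findings.append("system is swapping")

    # Disk utilization: flag devices above the limit (iostat %util).
    for device, util in sorted(metrics["disk_util_pct"].items()):
        if util > disk_util_limit:
            findings.append(
                f"disk {device}: {util:.0f}% busy, above {disk_util_limit:.0f}%"
            )

    return findings
```

An empty result does not prove the system is healthy; it only means none of these particular thresholds were crossed, so network (`sar -n DEV 1`) and per-process checks (`pidstat`) still need to be looked at separately.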