How Do CPU Threads Use the CPU? (Explained)


How do CPU threads use the CPU? Simple CPU metrics such as user/system/idle/io-wait are still widely used, even though today's multi-threaded / multi-core systems require careful interpretation of these figures. On such systems, the operating system's measurement of "idle" cannot be directly converted into available CPU resources, which makes capacity planning a much more challenging task.

Background

When a processor only had a single core capable of supporting a single thread, the operating system’s CPU utilization report provided information on the processor’s real resource usage (and resource availability). In such settings, CPU usage increases linearly as workload increases.

Multi-core CPUs: 1 processor = 2 or more cores

Each processing core in a multi-core CPU, which combines two or more cores in a single processor, has its own arithmetic and logic unit (ALU), floating-point unit (FPU), set of registers, pipeline, and, to some extent, its own cache.

Multi-core CPUs, however, also share certain resources among the cores (e.g. L3-Cache, memory controller).

CPUs/cores with simultaneous multi-threading: 2 or more threads per processor or core (aka “Hyper-Threading”, “Chip Multi-threading”)

The hardware resources of one physical core are shared across multiple threads. At a minimum, each thread has its own set of registers.

Most of the core's resources, including the arithmetic and logic unit, the floating-point unit, and the cache, are shared by all threads.

These threads naturally compete for processing power and stall if the units they need are already busy.

What are the advantages of sharing resources?

Sharing resources can boost throughput and efficiency by keeping a core’s processing units active. For instance, hyper-threading can minimize or conceal memory access stalls (cache misses).

Rather than wasting many cycles waiting for data to arrive from main memory, the current thread is suspended and the next runnable hardware thread resumes execution.
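This effect can be illustrated with a toy model (not a real CPU simulator): a single-issue core executes one instruction per cycle, and a cache miss stalls the issuing thread for a fixed penalty. All of the constants below are illustrative assumptions, not measured values.

```python
MISS_PENALTY = 100    # cycles a thread waits on main memory (assumed)
INSTRUCTIONS = 1000   # instructions each software thread must retire
MISS_EVERY = 50       # toy model: every 50th instruction misses the cache

def run(num_threads):
    """Cycles for num_threads hardware threads to retire their work.

    While one thread is stalled on a miss, another ready thread (if any)
    keeps the execution units busy -- the core only sits idle when every
    thread is stalled at the same time.
    """
    remaining = [INSTRUCTIONS] * num_threads
    executed = [0] * num_threads
    stalled_until = [0] * num_threads
    cycle = 0
    while any(r > 0 for r in remaining):
        for t in range(num_threads):
            if remaining[t] > 0 and stalled_until[t] <= cycle:
                remaining[t] -= 1
                executed[t] += 1
                if executed[t] % MISS_EVERY == 0:   # this instruction missed
                    stalled_until[t] = cycle + MISS_PENALTY
                break            # single-issue: one instruction per cycle
        cycle += 1
    return cycle

one = run(1)   # cycles for 1000 instructions on one hardware thread
two = run(2)   # cycles for 2000 instructions on two hardware threads
print(f"1 thread: {INSTRUCTIONS/one:.2f} IPC, "
      f"2 threads: {2*INSTRUCTIONS/two:.2f} IPC")
```

With these assumed numbers, the second hardware thread fills most of the stall cycles, so two threads' worth of work finishes in far fewer cycles than running the two workloads back to back.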

What drawbacks are there?

  • The CPU time accounting measures reported by standard tools (sys/usr/idle) do not take the effects of resource sharing between hardware threads into account.
  • As a consequence, idle time cannot be measured accurately, and available compute resources cannot be reliably determined.

Idle does not represent how much additional work the CPU is capable of performing.

Example: 

Assume there are 4 hardware threads per CPU core, and that this core is currently running 2 (single-threaded) processes that already consume all of the core's shared compute resources (ALU, FPU, cache, memory bandwidth, etc.).

Because two of the four logical processors (hardware threads) appear to be entirely idle, commonly used performance tools would nevertheless report the core as (at least) 50% idle, even though no compute capacity is actually left.

To accurately estimate how much work can be added before the system approaches full saturation, the operating system would need detailed utilization information for all shared core processing units (ALU, FPU, cache, memory bandwidth, etc.) as well as knowledge of the characteristics of the workload to be added.
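The gap between what the tools report and what is actually available can be made concrete with a few lines of arithmetic. The numbers below simply restate the hypothetical example above (4 hardware threads, 2 busy processes, shared units fully consumed); they are assumptions, not measurements.

```python
# Hypothetical core from the example: 4 hardware threads, 2 of them busy.
hw_threads = 4
busy_threads = 2

# What standard tools report: idle hardware threads / total hardware threads.
reported_idle = (hw_threads - busy_threads) / hw_threads

# What is actually left: the shared ALU/FPU/cache/memory bandwidth are
# fully consumed by the 2 running processes (assumption from the example),
# so the real headroom is zero.
shared_unit_utilization = 1.0
actual_headroom = 1.0 - shared_unit_utilization

print(f"reported idle:   {reported_idle:.0%}")
print(f"actual headroom: {actual_headroom:.0%}")
```

The tools report 50% idle while the real headroom is 0% -- exactly the discrepancy the example describes.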


Workload measurements with SAP ABAP

Let’s examine SAP-SD ABAP, a very particular yet prevalent workload in enterprise computing, to demonstrate our point. On a SPARC T5 system running the most recent Solaris 11 release, these measurements were made.

Simulated benchmark users entered SD transactions after logging into the SAP system. The 100% mark on the X-Axis represents the maximum number of SD-Users and SAP transaction throughput the system could support.

The operating system recorded CPU use (Y-Axis) at 0%, 12.5%, 25%, 50%, 60%, 75%, 90%, and 100% of the maximum number of SD-Users throughout a series of test runs.

Contrary to what one might mistakenly assume, the diagram does not show a straight diagonal line. Instead, at 25% of the maximum SD-User / throughput load, the operating system reports only 8% CPU utilization and 92% idle.

At half of the maximum throughput, the system appears to be only 21% busy and 79% idle.

To put it another way, when the OS displays 50% CPU utilization we are already operating at 80% of maximum throughput, so we cannot expect that adding the same load again will double throughput while maintaining response times. Yet customers frequently make exactly this mistake and report it to us.
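The measured relationship can be sketched as a small lookup table built from the data points quoted above (8% utilization at 25% load, 21% at 50%, 50% at 80%); the linear interpolation between points is an approximation I am adding for illustration, not part of the published measurements.

```python
# (reported CPU utilization %, throughput as % of max) from the SAP-SD runs
# described in the text; intermediate values are linearly interpolated.
points = [(0, 0), (8, 25), (21, 50), (50, 80), (100, 100)]

def throughput_at(util):
    """Estimate throughput (% of max) for a reported CPU utilization (%)."""
    for (u0, t0), (u1, t1) in zip(points, points[1:]):
        if u0 <= util <= u1:
            return t0 + (t1 - t0) * (util - u0) / (u1 - u0)
    raise ValueError("utilization out of range")

# Naive reading: 50% utilization means half the capacity is still free.
# Actual curve: 50% reported utilization is already ~80% of max throughput.
print(throughput_at(50))
```

Any capacity estimate that extrapolates linearly from reported utilization will therefore badly overestimate the remaining headroom.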

The workload (application or mix of applications) and CPU architecture have a significant impact on the curve in the diagram (number of hardware threads, shared computing resources, etc.).

The majority of programs using multi-threaded architectures are likely to exhibit this non-linear behavior (more or less pronounced).

With the introduction of multi-thread/multi-core CPU architectures, capacity planning has become a considerably more complicated process.

To determine how much more load can be added to an existing system, one must assess both the additional demand and the current resource usage.

Final thought 

In addition to tools like vmstat, iostat, mpstat, or prstat, Solaris 11 (and even the most recent update of Solaris 10) includes several performance monitoring tools such as pgstat, cpustat, and cputrack that enable a much more precise observation of CPU resource utilization.

To understand a specific workload, additional tools such as the Oracle Solaris Studio Performance Analyzer can be extremely helpful.
