Commit 82c18be

Preliminary support for NVIDIA Jetson boards
The NVIDIA Jetson is an industrial Linux-based embedded aarch64 platform with a powerful built-in GPU. It is used for AI tasks, mostly computer vision (CV).

Support is enabled with the --enable-nvidia-jetson switch of the configure script. All source code related to the NVIDIA Jetson is placed in the linux/NvidiaJetson.{h,c} source files and guarded by the 'NVIDIA_JETSON' C preprocessor define, so the source code for x86_64 platforms stays unchanged.

Additional functionality added by this commit:

1. Fix for the CPU temperature reading. The Jetson device is not supported by libsensors. The CPU has 8 cores with only one CPU temperature sensor for all of them, exposed through a thermal zone file. libsensors may be compiled in or turned off; care was taken to make the build succeed both with and without libsensors.
2. A Jetson GPU meter was added: current load, frequency and temperature.
3. The exact GPU memory allocated by each process is loaded from the nvgpu kernel driver via sysfs and merged into the LinuxProcess data (field LinuxProcess::gpu_mem). The "GPU_MEM" column displays this field. For the root user only.
4. An additional filter, toggled with the 'g' hot key, shows only the processes that are using the GPU right now; the help is updated accordingly. For the root user only.

== Technical details ==

The code tries to locate the correct sensors during application startup. As an example, the sensor locations for the NVIDIA Jetson Orin are the following:

- CPU temperature: /sys/devices/virtual/thermal/thermal_zone0/type
- GPU temperature: /sys/devices/virtual/thermal/thermal_zone1/type
- GPU frequency: /sys/class/devfreq/17000000.gpu/cur_freq
- GPU current load: /sys/class/devfreq/17000000.gpu/device/load

Units:

- The GPU frequency is provided in Hz and shown in MHz.
- The CPU/GPU temperatures are provided in millidegrees Celsius (degrees Celsius multiplied by 1000) and shown in degrees Celsius.

P.S. The GUI shows all temperatures for the NVIDIA Jetson with additional precision compared to the default x86_64 platform.
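
The unit conversions described under "Technical details" can be sketched as follows. This is a minimal illustration, not the code from linux/NvidiaJetson.c: the helper names `millicelsius_to_celsius`, `hz_to_mhz` and `read_long_from_file` are hypothetical, and the sketch only assumes that each sensor file contains a single integer.

```c
#include <stdio.h>

/* Hypothetical helpers for illustration; none of these names come from the commit. */

/* Raw temperature is in millidegrees Celsius, e.g. 45500 -> 45.5 */
double millicelsius_to_celsius(long raw) {
   return (double)raw / 1000.0;
}

/* Raw GPU frequency is in Hz; the meter shows MHz (integer part only) */
unsigned long hz_to_mhz(unsigned long long hz) {
   return (unsigned long)(hz / 1000000ULL);
}

/* Reads one integer from a sysfs-style file.
 * Returns 0 on success, -1 if the file cannot be opened or parsed. */
int read_long_from_file(const char* path, long* out) {
   FILE* f = fopen(path, "r");
   if (!f)
      return -1;
   int ok = (fscanf(f, "%ld", out) == 1) ? 0 : -1;
   fclose(f);
   return ok;
}
```

Keeping the temperature as a double after the division is what makes the extra precision mentioned in the P.S. available to the display code.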

If htop starts with root privileges (effective user id is 0), the experimental code is activated. It reads the fixed file /sys/kernel/debug/nvmap/iovmm/clients, whose content looks like this, e.g.:

```
CLIENT PROCESS     PID  SIZE
user   gpu_burn    7979 23525644K
user   gnome_shell 8119 5800K
user   Xorg        2651 17876K
total                   23549320K
```

Unfortunately, the /sys/kernel/debug/* files are readable only by the root user, which is why the restriction applies.

The patch also adds a separate field 'GPU_MEM', which reads data from the added LinuxProcess::gpu_mem field. The field stores the memory allocated for the GPU in kilobytes. It is populated by the function NvidiaJetson_LoadGpuProcessTable (implemented in NvidiaJetson.c), which is called at the end of the function Machine_scanTables.

Additionally, a new Action is added: actionToggleGpuFilter, which is activated by the 'g' hot key (the help is updated accordingly). The GpuFilter shows only the processes that currently use the GPU (in effect, a greatly extended view of the nvmap/iovmm/clients table). This is achieved with the filtering machinery associated with ProcessTable::pidMatchList. The code constructs the GPU_PID_MATCH_LIST hash table; actionToggleGpuFilter then either stores it in ProcessTable::pidMatchList or restores the previous value of ProcessTable::pidMatchList.

A separate LinuxProcess flag, PROCESS_FLAG_LINUX_GPU_JETSON (or something ...), is not added for GPU_MEM, because populating LinuxProcess::gpu_mem currently shares its implementation with the construction of the GPU consumers filter. So, even if the GPU_MEM field is not activated, the filter showing GPU consumers still works. This architecture was chosen intentionally: it saves memory for the GPU_PID_MATCH_LIST hash table (which is now actually a set) and therefore improves performance. All other approaches would convert GPU_PID_MATCH_LIST into a true key/value storage (key = pid, value = GPU memory allocated) and require additional merge code.
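
Parsing one row of the clients table shown above might look like the sketch below. It is not the implementation of NvidiaJetson_LoadGpuProcessTable: the `GpuClient` type and `parse_client_line` function are hypothetical, and the sketch assumes the row layout from the sample (client, process name without spaces, pid, size with a `K` suffix).

```c
#include <stdio.h>

/* Hypothetical record; the commit itself stores the size in LinuxProcess::gpu_mem. */
typedef struct {
   int pid;
   unsigned long long sizeKiB;
} GpuClient;

/* Parses one row such as "user gpu_burn 7979 23525644K".
 * Returns 1 on success and 0 for the header, the "total" row,
 * or any line that does not match the assumed layout. */
int parse_client_line(const char* line, GpuClient* out) {
   char client[32];
   char process[64];
   int pid;
   unsigned long long size;
   char unit;

   /* All five items must convert; header and "total" rows fail at the pid. */
   if (sscanf(line, "%31s %63s %d %llu%c", client, process, &pid, &size, &unit) != 5)
      return 0;
   if (unit != 'K')
      return 0;

   out->pid = pid;
   out->sizeKiB = size;
   return 1;
}
```

Because only the pid and the size are kept, the result can feed both a pid set for filtering and the per-process memory field, which matches the shared design the commit message describes.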

== NVIDIA Jetson models ==

Tested on NVIDIA Jetson Orin and Xavier boards.
1 parent cfb561f commit 82c18be

23 files changed, with 554 additions and 28 deletions

Action.c

Lines changed: 28 additions & 0 deletions
@@ -27,9 +27,11 @@ in the source distribution for its full text.
 #include "ListItem.h"
 #include "Macros.h"
 #include "MainPanel.h"
+#include "NvidiaJetson.h"
 #include "OpenFilesScreen.h"
 #include "Process.h"
 #include "ProcessLocksScreen.h"
+#include "ProcessTable.h"
 #include "ProvideCurses.h"
 #include "Row.h"
 #include "RowField.h"
@@ -646,6 +648,26 @@ static Htop_Reaction actionTogglePauseUpdate(State* st) {
    return HTOP_REFRESH | HTOP_REDRAW_BAR | HTOP_KEEP_FOLLOWING;
 }
 
+#ifdef NVIDIA_JETSON
+static Htop_Reaction actionToggleGpuFilter(State* st) {
+   static Hashtable *stash = NULL;
+
+   Hashtable *GpuPidMatchList = NvidiaJetson_GetPidMatchList();
+   if (GpuPidMatchList) {
+      st->showGpuProcesses = !st->showGpuProcesses;
+
+      ProcessTable *pt = (ProcessTable *)st->host->activeTable;
+      if (st->showGpuProcesses) {
+         stash = pt->pidMatchList;
+         pt->pidMatchList = GpuPidMatchList;
+      } else {
+         pt->pidMatchList = stash;
+      }
+   }
+   return HTOP_REFRESH | HTOP_REDRAW_BAR | HTOP_KEEP_FOLLOWING;
+}
+#endif
+
 static const struct {
    const char* key;
    bool roInactive;
@@ -658,6 +680,9 @@ static const struct {
    { .key = " F3 /: ", .roInactive = false, .info = "incremental name search" },
    { .key = " F4 \\: ", .roInactive = false, .info = "incremental name filtering" },
    { .key = " F5 t: ", .roInactive = false, .info = "tree view" },
+#ifdef NVIDIA_JETSON
+   { .key = " g: ", .roInactive = false, .info = "show GPU processes (root only)" },
+#endif
    { .key = " p: ", .roInactive = false, .info = "toggle program path" },
    { .key = " m: ", .roInactive = false, .info = "toggle merged command" },
    { .key = " Z: ", .roInactive = false, .info = "pause/resume process updates" },
@@ -933,6 +958,9 @@ void Action_setBindings(Htop_Action* keys) {
    keys['a'] = actionSetAffinity;
    keys['c'] = actionTagAllChildren;
    keys['e'] = actionShowEnvScreen;
+#ifdef NVIDIA_JETSON
+   keys['g'] = actionToggleGpuFilter;
+#endif
    keys['h'] = actionHelp;
    keys['k'] = actionKill;
    keys['l'] = actionLsof;
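
The stash-and-restore behaviour of actionToggleGpuFilter can be illustrated in isolation. The sketch below is a simplified stand-in rather than htop code: `Table`, `FilterToggle` and `toggle_filter` are hypothetical names, and the stashed pointer is kept in a caller-owned struct instead of the function-local static used in the commit.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for ProcessTable; only the filter pointer matters here. */
typedef struct {
   const void* pidMatchList;
} Table;

/* Hypothetical toggle state: whether the GPU filter is on, and what it replaced. */
typedef struct {
   bool active;
   const void* stash;
} FilterToggle;

/* Switches the table between its previous filter and the GPU pid set.
 * Does nothing when no GPU pid set is available. */
void toggle_filter(FilterToggle* t, Table* table, const void* gpuList) {
   if (!gpuList)
      return;

   t->active = !t->active;
   if (t->active) {
      t->stash = table->pidMatchList;   /* remember the filter being replaced */
      table->pidMatchList = gpuList;
   } else {
      table->pidMatchList = t->stash;   /* put the previous filter back */
   }
}
```

Toggling twice leaves the table with the filter it started with, so a pid filter that was already active before pressing 'g' is not lost.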

Action.h

Lines changed: 3 additions & 0 deletions
@@ -40,6 +40,9 @@ typedef struct State_ {
    bool pauseUpdate;
    bool hideSelection;
    bool hideMeters;
+#ifdef NVIDIA_JETSON
+   bool showGpuProcesses;
+#endif
 } State;
 
 static inline bool State_hideFunctionBar(const State* st) {

CPUMeter.c

Lines changed: 23 additions & 5 deletions
@@ -98,15 +98,33 @@ static void CPUMeter_updateValues(Meter* this) {
       }
    }
 
+   /*
+      --enable-sensors turns on BUILD_WITH_CPU_TEMP only
+      --enable-nvidia-jetson turns on both NVIDIA_JETSON and BUILD_WITH_CPU_TEMP
+   */
    #ifdef BUILD_WITH_CPU_TEMP
    if (settings->showCPUTemperature) {
       double cpuTemperature = this->values[CPU_METER_TEMPERATURE];
-      if (isNaN(cpuTemperature))
+      if (isNaN(cpuTemperature)) {
          xSnprintf(cpuTemperatureBuffer, sizeof(cpuTemperatureBuffer), "N/A");
-      else if (settings->degreeFahrenheit)
-         xSnprintf(cpuTemperatureBuffer, sizeof(cpuTemperatureBuffer), "%3d%sF", (int)(cpuTemperature * 9 / 5 + 32), CRT_degreeSign);
-      else
-         xSnprintf(cpuTemperatureBuffer, sizeof(cpuTemperatureBuffer), "%d%sC", (int)cpuTemperature, CRT_degreeSign);
+      } else if (settings->degreeFahrenheit) {
+         cpuTemperature = ConvCelsiusToFahrenheit(cpuTemperature);
+         /* Fahrenheit scale gives almost x2 more precise value than Celsius scale => no need to show fractional part */
+         xSnprintf(cpuTemperatureBuffer, sizeof(cpuTemperatureBuffer), "%3d%sF", (int)cpuTemperature, CRT_degreeSign);
+      } else if (settings->showCPUTemperatureFractional) {
+         /*
+         - Modern CPUs has temperature sensors which give a precise value with 3 digits in the fractional part,
+           see hwmon files, e.g. /sys/class/hwmon/hwmon.../temp1_input, one digit in the fractional part is quite
+           enough right now.
+         - If your CPU is above 100C - you have a real problem, no need to print it pretty.
+         - The formatter "%04.1f" guarantees filling zero in the fractional part, e.g. strings like "37.0C" appears,
+           the side effect is that temperature value '5C' is shown as "05.0C"
+         */
+         xSnprintf(cpuTemperatureBuffer, sizeof(cpuTemperatureBuffer), "%04.1f%sC", cpuTemperature, CRT_degreeSign);
+      } else {
+         /* if your CPU is above 100C - you have a real problem, no need to print it pretty */
+         xSnprintf(cpuTemperatureBuffer, sizeof(cpuTemperatureBuffer), "%2d%sC", (int)cpuTemperature, CRT_degreeSign);
+      }
    }
    #endif
 
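
The effect of the "%04.1f" and "%2d" formats used in this hunk can be checked with a small standalone sketch. The helper names `celsius_to_fahrenheit` and `format_celsius` are hypothetical (the definition of ConvCelsiusToFahrenheit is not shown in this diff; the formula here follows the removed `cpuTemperature * 9 / 5 + 32` expression), and the degree sign from CRT_degreeSign is omitted for simplicity.

```c
#include <stdio.h>

/* Hypothetical helper; mirrors the removed inline expression from the diff. */
double celsius_to_fahrenheit(double celsius) {
   return celsius * 9.0 / 5.0 + 32.0;
}

/* Formats a Celsius value the way the hunk does, minus the degree sign.
 * fractional != 0 selects "%04.1f" (one decimal, zero-padded to width 4);
 * otherwise the value is truncated to an integer and printed with "%2d". */
int format_celsius(char* buf, size_t size, double celsius, int fractional) {
   if (fractional)
      return snprintf(buf, size, "%04.1fC", celsius);
   return snprintf(buf, size, "%2dC", (int)celsius);
}
```

With these formats, 37.0 is rendered as "37.0C" in fractional mode, and a single-digit value such as 5.0 becomes "05.0C", which is the side effect the code comment mentions.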

CRT.c

Lines changed: 7 additions & 0 deletions
@@ -202,6 +202,7 @@ static int CRT_colorSchemes[LAST_COLORSCHEME][LAST_COLORELEMENT] = {
       [GPU_ENGINE_2] = ColorPair(Yellow, Black),
       [GPU_ENGINE_3] = ColorPair(Red, Black),
       [GPU_ENGINE_4] = A_BOLD | ColorPair(Blue, Black),
+      [GPU_FILTER] = A_BOLD | ColorPair(Red, Cyan),
       [GPU_RESIDUE] = ColorPair(Magenta, Black),
       [PANEL_EDIT] = ColorPair(White, Blue),
       [SCREENS_OTH_BORDER] = ColorPair(Blue, Blue),
@@ -320,6 +321,7 @@ static int CRT_colorSchemes[LAST_COLORSCHEME][LAST_COLORELEMENT] = {
       [GPU_ENGINE_2] = A_NORMAL,
       [GPU_ENGINE_3] = A_REVERSE | A_BOLD,
       [GPU_ENGINE_4] = A_REVERSE,
+      [GPU_FILTER] = A_REVERSE,
       [GPU_RESIDUE] = A_BOLD,
       [PANEL_EDIT] = A_BOLD,
       [SCREENS_OTH_BORDER] = A_DIM,
@@ -438,6 +440,7 @@ static int CRT_colorSchemes[LAST_COLORSCHEME][LAST_COLORELEMENT] = {
       [GPU_ENGINE_2] = ColorPair(Yellow, White),
       [GPU_ENGINE_3] = ColorPair(Red, White),
       [GPU_ENGINE_4] = ColorPair(Blue, White),
+      [GPU_FILTER] = ColorPair(Blue, White),
       [GPU_RESIDUE] = ColorPair(Magenta, White),
       [PANEL_EDIT] = ColorPair(White, Blue),
       [SCREENS_OTH_BORDER] = A_BOLD | ColorPair(Black, White),
@@ -556,6 +559,7 @@ static int CRT_colorSchemes[LAST_COLORSCHEME][LAST_COLORELEMENT] = {
       [GPU_ENGINE_2] = ColorPair(Yellow, Black),
       [GPU_ENGINE_3] = ColorPair(Red, Black),
       [GPU_ENGINE_4] = ColorPair(Blue, Black),
+      [GPU_FILTER] = A_BOLD | ColorPair(Yellow, Cyan),
       [GPU_RESIDUE] = ColorPair(Magenta, Black),
       [PANEL_EDIT] = ColorPair(White, Blue),
       [SCREENS_OTH_BORDER] = ColorPair(Blue, Black),
@@ -674,6 +678,7 @@ static int CRT_colorSchemes[LAST_COLORSCHEME][LAST_COLORELEMENT] = {
       [GPU_ENGINE_2] = A_BOLD | ColorPair(Yellow, Blue),
       [GPU_ENGINE_3] = A_BOLD | ColorPair(Red, Blue),
       [GPU_ENGINE_4] = A_BOLD | ColorPair(White, Blue),
+      [GPU_FILTER] = A_BOLD | ColorPair(White, Cyan),
       [GPU_RESIDUE] = A_BOLD | ColorPair(Magenta, Blue),
       [PANEL_EDIT] = ColorPair(White, Blue),
       [SCREENS_OTH_BORDER] = A_BOLD | ColorPair(Yellow, Blue),
@@ -790,6 +795,7 @@ static int CRT_colorSchemes[LAST_COLORSCHEME][LAST_COLORELEMENT] = {
       [GPU_ENGINE_2] = ColorPair(Yellow, Black),
       [GPU_ENGINE_3] = ColorPair(Red, Black),
       [GPU_ENGINE_4] = ColorPair(Blue, Black),
+      [GPU_FILTER] = A_BOLD | ColorPair(Yellow, Green),
       [GPU_RESIDUE] = ColorPair(Magenta, Black),
       [PANEL_EDIT] = ColorPair(White, Cyan),
       [SCREENS_OTH_BORDER] = ColorPair(White, Black),
@@ -905,6 +911,7 @@ static int CRT_colorSchemes[LAST_COLORSCHEME][LAST_COLORELEMENT] = {
       [GPU_ENGINE_2] = A_NORMAL,
       [GPU_ENGINE_3] = A_BOLD | ColorPair(Cyan, Black),
       [GPU_ENGINE_4] = A_BOLD | ColorPair(Cyan, Black),
+      [GPU_FILTER] = A_BOLD | ColorPair(Red, Cyan),
       [GPU_RESIDUE] = A_BOLD,
       [PANEL_EDIT] = A_BOLD,
       [SCREENS_OTH_BORDER] = A_BOLD | ColorPairGrayBlack,

CRT.h

Lines changed: 1 addition & 0 deletions
@@ -129,6 +129,7 @@ typedef enum ColorElements_ {
    GPU_ENGINE_2,
    GPU_ENGINE_3,
    GPU_ENGINE_4,
+   GPU_FILTER,
    GPU_RESIDUE,
    PANEL_EDIT,
    SCREENS_OTH_BORDER,

DisplayOptionsPanel.c

Lines changed: 4 additions & 3 deletions
@@ -185,14 +185,15 @@ DisplayOptionsPanel* DisplayOptionsPanel_new(Settings* settings, ScreenManager*
    Panel_add(super, (Object*) CheckItem_newByRef("Also show CPU frequency", &(settings->showCPUFrequency)));
    #ifdef BUILD_WITH_CPU_TEMP
    Panel_add(super, (Object*) CheckItem_newByRef(
-   #if defined(HTOP_LINUX)
-                "Also show CPU temperature (requires libsensors)",
-   #elif defined(HTOP_FREEBSD)
+   #if defined(HTOP_FREEBSD) || defined(NVIDIA_JETSON)
                 "Also show CPU temperature",
+   #elif defined(HTOP_LINUX)
+                "Also show CPU temperature (requires libsensors)",
    #else
    #error Unknown temperature implementation!
    #endif
                 &(settings->showCPUTemperature)));
+   Panel_add(super, (Object*) CheckItem_newByRef("- Show fractional CPU temperature for Celsius", (&settings->showCPUTemperatureFractional)));
    Panel_add(super, (Object*) CheckItem_newByRef("- Show temperature in degree Fahrenheit instead of Celsius", &(settings->degreeFahrenheit)));
    #endif
    Panel_add(super, (Object*) CheckItem_newByRef("Show cached memory in graph and bar modes", &(settings->showCachedMemory)));

Machine.c

Lines changed: 5 additions & 0 deletions
@@ -13,6 +13,7 @@ in the source distribution for its full text.
 #include <stdlib.h>
 #include <unistd.h>
 
+#include "NvidiaJetson.h"
 #include "Object.h"
 #include "Platform.h"
 #include "Row.h"
@@ -129,4 +130,8 @@ void Machine_scanTables(Machine* this) {
 
    Row_setUidColumnWidth(this->maxUserId);
    Row_setPidColumnWidth(this->maxProcessId);
+
+#ifdef NVIDIA_JETSON
+   NvidiaJetson_LoadGpuProcessTable(this->activeTable->table);
+#endif
 }

MainPanel.c

Lines changed: 6 additions & 0 deletions
@@ -197,6 +197,12 @@ static void MainPanel_drawFunctionBar(Panel* super, bool hideFunctionBar) {
    } else if (this->state->failedUpdate) {
       FunctionBar_append(this->state->failedUpdate, CRT_colors[FAILED_READ]);
    }
+
+#ifdef NVIDIA_JETSON
+   if (this->state->showGpuProcesses) {
+      FunctionBar_append("GPU", CRT_colors[GPU_FILTER]);
+   }
+#endif
 }
 
 static void MainPanel_printHeader(Panel* super) {

Makefile.am

Lines changed: 5 additions & 0 deletions
@@ -215,6 +215,11 @@ linux_platform_sources = \
 	zfs/ZfsArcMeter.c \
 	zfs/ZfsCompressedArcMeter.c
 
+if NVIDIA_JETSON
+linux_platform_headers += linux/NvidiaJetson.h
+linux_platform_sources += linux/NvidiaJetson.c
+endif
+
 if HAVE_DELAYACCT
 linux_platform_headers += linux/LibNl.h
 linux_platform_sources += linux/LibNl.c

Settings.c

Lines changed: 4 additions & 0 deletions
@@ -472,6 +472,8 @@ static bool Settings_read(Settings* this, const char* fileName, const Machine* h
       #ifdef BUILD_WITH_CPU_TEMP
       } else if (String_eq(option[0], "show_cpu_temperature")) {
          this->showCPUTemperature = atoi(option[1]);
+      } else if (String_eq(option[0], "show_cpu_temperature_fractional")) {
+         this->showCPUTemperatureFractional = atoi(option[1]);
       } else if (String_eq(option[0], "degree_fahrenheit")) {
          this->degreeFahrenheit = atoi(option[1]);
       #endif
@@ -703,6 +705,7 @@ int Settings_write(const Settings* this, bool onCrash) {
    printSettingInteger("show_cpu_frequency", this->showCPUFrequency);
    #ifdef BUILD_WITH_CPU_TEMP
    printSettingInteger("show_cpu_temperature", this->showCPUTemperature);
+   printSettingInteger("show_cpu_temperature_fractional", this->showCPUTemperatureFractional);
    printSettingInteger("degree_fahrenheit", this->degreeFahrenheit);
    #endif
    printSettingInteger("show_cached_memory", this->showCachedMemory);
@@ -808,6 +811,7 @@ Settings* Settings_new(const Machine* host, Hashtable* dynamicMeters, Hashtable*
    this->showCPUFrequency = false;
    #ifdef BUILD_WITH_CPU_TEMP
    this->showCPUTemperature = false;
+   this->showCPUTemperatureFractional = false;
    this->degreeFahrenheit = false;
    #endif
    this->showCachedMemory = true;
