Hello all. I’m hoping to sanity-check a persistent GPU TDR issue that seems hardware/platform related but only shows up reliably during SOLIDWORKS use. I have plenty of experience with SOLIDWORKS itself crashing, but not with the whole system going down repeatedly.
System
- Laptop: Lenovo ThinkPad P16 Gen 3
- GPU: NVIDIA RTX Pro 3000 (Blackwell)
- OS: Windows 11 Pro
- SOLIDWORKS: 2024 (majority), 2025 (also occurs)
- Simple / Basic software install. Nothing unusual.
- All undocked use, without external monitor.
- Crashes have all occurred while plugged in.
Symptoms / frequency
Over the last ~50 days I’ve had ~30 system-level crashes that appear to be TDR events (the whole system goes down; not “just SOLIDWORKS” crashing). Most share the same signature; a recent one added ACPI errors beforehand. Maybe 2 of the 30 happened without SOLIDWORKS running. Crashes typically happen during longer CAD sessions, pretty much at random; I can't seem to do any particular thing to trigger one. Some days I can work for 8 hours straight with no crashes, and the next day I get 2.
What I’ve already tried
- Many “expected/known stable” NVIDIA driver versions (including Studio / recommended branches)
- SOLIDWORKS 2024 (majority of crashes) and SOLIDWORKS 2025 (still crashes)
- Long troubleshooting session with GoEngineer support (settings review, driver + system settings, etc.)
- Lenovo repairs on the original laptop (yes, they wanted to do this in two steps):
  - Motherboard replaced → still crashes
  - GPU replaced → still crashes
- Windows reinstall multiple times → still crashes
- Laptop replaced (brand new unit) → more stable, but still crashing
- Lenovo's engineering team has my original laptop but has not been able to replicate the behavior without SOLIDWORKS.
- Certified drivers are not listed for this machine on the SOLIDWORKS site, although certified drivers for the Blackwell Pro 3000 GPU are listed for other hardware. (This is odd, because Lenovo advertises that all P series hardware is vetted prior to release. Apparently not in this case.)
Current situation
The replacement laptop feels more stable and has had only 5 crashes in about 2 weeks:
- 4/5 match the original signature
- 1/5 was maybe “worse”: TDR preceded by ACPI errors ~40 minutes earlier
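To put a rough number on "more stable", here is a quick back-of-the-envelope sketch. It assumes the ~30 crashes / ~50 days totals include the replacement laptop's 5 crashes and ~14 days (my counts are approximate, so treat the result as a ballpark only). The function name is just for illustration.

```python
def crashes_per_day(crashes, days):
    """Average crash rate over a period."""
    return crashes / days

total_crashes, total_days = 30, 50   # approximate totals from the symptoms section
new_crashes, new_days = 5, 14        # replacement laptop, "about 2 weeks"

# Assumption: the totals above include the replacement laptop's share.
old_crashes = total_crashes - new_crashes   # 25
old_days = total_days - new_days            # 36

print(f"original:    {crashes_per_day(old_crashes, old_days):.2f} crashes/day")   # 0.69
print(f"replacement: {crashes_per_day(new_crashes, new_days):.2f} crashes/day")   # 0.36
```

So under that assumption the replacement crashes roughly half as often, which matches how it feels, but it is clearly not fixed.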
Typical crash signature (~29/30 events)
In Windows Event Viewer, I consistently see WHEA-Logger Event ID 17 entries around the crash:
- WHEA 17 – PCIe endpoint: NVIDIA GPU (VEN_10DE / DEV_2F38)
- WHEA 17 – PCIe device: Intel (VEN_8086 / DEV_272B)
Then: TDR / system crash → bugcheck + minidump
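In case it helps anyone compare against their own logs, this is roughly how the WHEA-17 entries could be tallied per device from exported event text. It is only a minimal sketch: the function name and regex are my own, and the sample messages are placeholders (only the VEN/DEV IDs come from my logs, the wording and counts do not).

```python
import re
from collections import Counter

# Matches PCI vendor/device IDs such as "VEN_10DE ... DEV_2F38" in free text.
ID_PATTERN = re.compile(r"VEN_([0-9A-Fa-f]{4}).*?DEV_([0-9A-Fa-f]{4})")

def tally_pci_ids(messages):
    """Count how often each (vendor, device) ID pair appears in the messages."""
    counts = Counter()
    for text in messages:
        for vendor, device in ID_PATTERN.findall(text):
            counts[(vendor.upper(), device.upper())] += 1
    return counts

# Placeholder messages: the IDs are real, the text and counts are made up.
sample_messages = [
    "WHEA 17 PCIe endpoint VEN_10DE / DEV_2F38",
    "WHEA 17 PCIe device VEN_8086 / DEV_272B",
    "WHEA 17 PCIe endpoint VEN_10DE / DEV_2F38",
]

for (vendor, device), n in tally_pci_ids(sample_messages).most_common():
    print(f"VEN_{vendor} / DEV_{device}: {n}")
```

Feeding it the message text of each exported WHEA-Logger event would show whether the NVIDIA and Intel entries always appear together or whether one dominates.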
Newer “ACPI-preceded” crash (1 event)
About 40 minutes before the TDR:
- 1× ACPI Event ID 13
- 4× ACPI Event ID 15
Then the rest of the crash looked the same as above.
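To check whether other crashes also have something logged in the run-up, a small sketch like this could list every event within a window before a crash. The timestamps below are hypothetical placeholders (I only know the ACPI events were about 40 minutes ahead), and the function name and 60-minute window are my own choices.

```python
from datetime import datetime, timedelta

def events_before(crash_time, events, window_minutes=60):
    """Return (lead_time, label) pairs for events logged within the window before the crash."""
    window = timedelta(minutes=window_minutes)
    hits = []
    for timestamp, label in events:
        lead = crash_time - timestamp
        if timedelta(0) <= lead <= window:
            hits.append((lead, label))
    return sorted(hits, reverse=True)  # longest lead time (earliest event) first

# Hypothetical timestamps, for illustration only.
crash_time = datetime(2025, 1, 1, 15, 0)
sample_events = [
    (datetime(2025, 1, 1, 12, 0), "unrelated event"),   # outside the window
    (datetime(2025, 1, 1, 14, 20), "ACPI 13"),
    (datetime(2025, 1, 1, 14, 20), "ACPI 15"),
    (datetime(2025, 1, 1, 14, 20), "ACPI 15"),
    (datetime(2025, 1, 1, 14, 20), "ACPI 15"),
    (datetime(2025, 1, 1, 14, 20), "ACPI 15"),
    (datetime(2025, 1, 1, 14, 59), "WHEA 17"),
]

for lead, label in events_before(crash_time, sample_events):
    print(f"{label}: {lead} before crash")
```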
Has anyone seen some version of this with similar hardware?
- Has anyone seen a P16 Gen 3 (or similar platform) produce WHEA-17 PCIe spam + GPU TDR under SOLIDWORKS?
- Does the pairing of NVIDIA PCIe endpoint WHEA-17 plus an Intel PCIe device WHEA-17 suggest anything specific (power management? link state? BIOS/firmware? docking/USB4/TB path?)?
- Any targeted changes worth trying that I might have missed (BIOS settings, PCIe ASPM / Link State Power Management, hybrid graphics toggles, power plan knobs, etc.)?
Thanks in advance — I'd love to find a solution before trading this in for something older but more tried and true. I sold my Gen 1 when the Gen 3 arrived, before the crashes started.
