Bug 670 - In fglrx/intel hybrid systems X server crashes on reboot after having used the integrated GPU
Status: CLOSED WONTFIX
Alias: None
Product: AMD Catalyst™ Proprietary Display Driver
Classification: Unclassified
Component: Kernel Module
Version: .archived
Hardware: Radeon HD 6000 Series Linux
Importance: low major
Assignee: nobody
Reported: 2012-12-09 17:13 CST by Nick Andrik
Modified: 2014-01-13 15:31 CST

Attachments
igpu crash log (4.24 KB, text/x-log)
2012-12-09 17:16 CST, Nick Andrik
dgpu crash log (5.47 KB, text/x-log)
2012-12-09 17:18 CST, Nick Andrik
igpu crash log (6.90 KB, text/x-log)
2012-12-09 18:48 CST, Nick Andrik
dgpu crash log (8.20 KB, text/x-log)
2012-12-09 18:51 CST, Nick Andrik

Description Nick Andrik 2012-12-09 17:13:50 CST
Description of problem: 
I have the fglrx driver installed on my hybrid fglrx/intel system and I am using the integrated GPU, selected with:
amdconfig --px-igpu
The X server crashes on roughly every other boot (it works once, crashes the next time, and so on).

Steps to reproduce:
1. Install fglrx driver in hybrid fglrx/intel setup
2. Select the integrated GPU (intel)
3. Reboot

Actual result: 
The X server crashes 

Expected result: 
The X server should start normally every time.
Comment 1 Nick Andrik 2012-12-09 17:16:05 CST
Created attachment 635
igpu crash log

Crash log when I reboot with the iGPU selected
Comment 2 Nick Andrik 2012-12-09 17:18:16 CST
Created attachment 636
dgpu crash log

Crash log when I reboot with the dGPU selected.

To reproduce this, boot once successfully with the iGPU, run
amdconfig --px-dgpu
and then reboot.
Comment 3 Nick Andrik 2012-12-09 18:48:57 CST
Created attachment 637
igpu crash log

Updated log (starting earlier in time) from a reboot with the iGPU selected
Comment 4 Nick Andrik 2012-12-09 18:51:23 CST
Created attachment 638
dgpu crash log

Updated log (starting earlier in time) from a reboot with the dGPU selected
Comment 5 Nick Andrik 2012-12-09 19:03:47 CST
I believe the problem is here:

[fglrx] Reserved FB block: Shared offset:0, size:1000000
[fglrx] Reserved FB block: Unshared offset:f936000, size:3ca000
[fglrx] Reserved FB block: Unshared offset:3fff4000, size:c000
[fglrx:firegl_cail_early_init] *ERROR* CAIL: already initialized!
[fglrx:hal_init_gpu] *ERROR* Failed to early init cail!
[fglrx] device open failed with code -1


When the crash does not occur, the log at this point looks like this instead:

[fglrx] Reserved FB block: Shared offset:0, size:1000000 
[fglrx] Reserved FB block: Unshared offset:f936000, size:3ca000 
[fglrx] Reserved FB block: Unshared offset:3fff4000, size:c000 
[fglrx] IRQ 51 Disabled
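
The difference between the two excerpts can be checked mechanically. A minimal sketch, assuming the kernel messages of a boot have been saved to a file; the function name and the file argument are hypothetical, and the matched string is copied from the failing excerpt above:

```sh
# has_cail_error LOGFILE
# Exit status 0 if LOGFILE contains the fglrx early-init failure
# seen in the crashing boot, non-zero otherwise.
has_cail_error() {
    grep -q 'CAIL: already initialized' "$1"
}
```

For example, after saving the messages of a boot to boot.log, `has_cail_error boot.log && echo "bad boot"` would flag the crashing case.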
Comment 6 Nick Andrik 2012-12-10 12:31:08 CST
My computer is a Sony Vaio VPCCB2S1E with these GPUs:
Intel® HD Graphics 3000
AMD Radeon™ HD 6630M
Comment 7 Kirils Solovjovs 2012-12-12 12:40:46 CST
I can confirm this behavior on a hybrid fglrx/fglrx system as well. I think this is related to #652. I also think that this bug is of critical importance. The card has to be set to integrated mode, either through the graphical Catalyst application or through amdconfig/aticonfig, for this bug to appear.

Tested with Linux 3.5.0 and 3.7.0. No difference.
I have a 64-bit AMD A8-4500M CPU with two video adapters: HD 7640G and 7670M.

BIOS update did not help.

I've been able to pinpoint that this has to do with PX_GPUDOWN=R00010000 being added to /etc/ati/amdpcsdb.

I worked around this issue by doing this as root:
cd /etc/ati
cat amdpcsdb | grep -v GPUDOWN > tmpset
cat tmpset > amdpcsdb; chattr +i amdpcsdb


(To revert this workaround do: chattr -i amdpcsdb)
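
The same filtering can be done with a backup kept alongside, in case the edit needs to be undone by hand. This is only a sketch: the function name and the .bak suffix are made up for illustration, and it treats the file as plain lines, exactly as the grep above does.

```sh
# strip_gpudown FILE
# Copy FILE to FILE.bak, then rewrite FILE without any line that
# mentions GPUDOWN. Writing into the existing file (instead of moving
# a new one over it) keeps its owner and permissions.
strip_gpudown() {
    db=$1
    cp -p "$db" "$db.bak" || return 1
    sed '/GPUDOWN/d' "$db.bak" > "$db"
}
```

It would be run as root as `strip_gpudown /etc/ati/amdpcsdb`, followed by the same chattr +i step as above.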


On one occasion I also saw a kernel BUG message. The letters were flickering and jumping around a bit while the message was displayed.
kernel BUG at /build/buildd/linux-3.5.0/mm/slub.c:3474
invalid opcode: 0000 [#2] SMP
CPU 0
Modules linked in: pci_stub ... fglrx(PO) .. rts_pstor(C) ... aes_x86_64 atl1c wmi video
Pid: 1024, comm: gdbus Tainted: P        D   C 0 3.5.0-17-generic #28 Ubuntu Acer Aspire V3-551G/VA50_CM
...
call trace:
eventfd_ctx_put+0x25/0x30
eventfs_release+0x35/0x40
fput+0x118/0x260
filp_close+0x66/0xa0
sys_close+0x9e/0x100
system_call_fastpath+0x16/0x1b
Code: 89 d9 48 89 da 4c 89 d6 e8 fe 94 50 00 eb a8 49 f7 02 00 c0 00 00 74 13 4c 89 d7 e8 a0 16 fc ff eb 95 4d 8b 52 30 e9 42 ff ff ff <0f> 0b 0f 1f 00 55 48 89 e5 53 48 83 ec 08 66 66 66 66 90 48 89
I hope this information is useful.


Also, I do not think this is a crash, more of a freeze. The system freezes just before X should be started. There is a cursor on the screen, but it's not blinking. Reboot with magic-sysrq works.
Comment 8 Nick Andrik 2012-12-12 17:39:33 CST
Hi Kirils,

First of all, many thanks for giving a workaround.
I had no idea that ATI keeps a configuration database, and I was wondering how this bug could persist across reboots.

As for the workaround, I think this may be easier (as root):
aticonfig --del-pcs-key=DDX,PX_GPUDOWN

By doing this after every reboot, I eliminated the freezes/crashes on X start. 


One thing that may help in debugging the issue:
andrikos@Gauss: ~ $ aticonfig --get-pcs-key=DDX,PX_GPUDOWN
Raw Hex (4 bytes): 00010000
andrikos@Gauss: ~ $ sudo aticonfig --del-pcs-key=DDX,PX_GPUDOWN
Error: Key DDX,PX_GPUDOWN not found in PCS database
andrikos@Gauss: ~ $ aticonfig --get-pcs-key=DDX,PX_GPUDOWN
aticonfig: No supported adapters detected

Whatever command I might run after that point, I always get:
No supported adapters detected
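
Since the key has to be deleted again after every boot, the command could be wrapped so that it can be called from whatever boot-time hook the distribution provides. A rough sketch only: the function name and the ATICONFIG variable are hypothetical, and because the exit status of aticonfig is not known from this report, the output is passed through rather than interpreted.

```sh
# Command to run; overridable so the sketch can be tried on a machine
# without the driver installed.
: "${ATICONFIG:=aticonfig}"

# clear_px_gpudown
# Run the delete command from this comment and print whatever the tool
# reports (stdout and stderr together).
clear_px_gpudown() {
    "$ATICONFIG" --del-pcs-key=DDX,PX_GPUDOWN 2>&1
}
```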
Comment 9 Rakot 2012-12-23 17:13:58 CST
What is your Mesa version?
As far as I know, Mesa 9.0 ships libGL.so.1.2.0 instead of the libGL.so.1.2 of previous versions, while switchlibGL still uses libGL.so.1.2.

I can also confirm this bug for AMD Catalyst 12.10 and all 12.11 betas.
My system is:
00:02.0 VGA compatible controller: Intel Corporation 3rd Gen Core processor Graphics Controller (rev 09)
01:00.0 VGA compatible controller: Advanced Micro Devices [AMD] nee ATI Cape Verde [Radeon HD 7700M Series] (rev ff)

Software:
opensuse 12.2
kernel 3.7.1-1-desktop
X server 1.12.3
intel 2.20.0
Mesa 8.0.4
Comment 10 Nick Andrik 2013-01-03 15:04:03 CST
Hi Rakot,

My Mesa version is 9.0

I have these libGL.* files on my Ubuntu system:
/usr/lib32/fglrx/libGL.so.1 -> libGL.so.1.2
/usr/lib32/fglrx/libGL.so.1.2
/usr/lib/fglrx/libGL.so -> libGL.so.1
/usr/lib/fglrx/libGL.so.1 -> libGL.so.1.2
/usr/lib/fglrx/libGL.so.1.2
/usr/lib/i386-linux-gnu/mesa/libGL.so.1 -> libGL.so.1.2.0
/usr/lib/i386-linux-gnu/mesa/libGL.so.1.2.0
/usr/lib/x86_64-linux-gnu/mesa/libGL.so.1 -> libGL.so.1.2.0
/usr/lib/x86_64-linux-gnu/mesa/libGL.so.1.2.0
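
A listing in this form can be regenerated after a driver or Mesa update with a small loop. A sketch, assuming readlink is available (it is part of GNU coreutils); the function name is made up and the directories to scan are passed as arguments:

```sh
# list_libgl_links DIR...
# For every libGL.so* entry directly inside each DIR, print
# "path -> target" for symlinks and the bare path for regular files.
list_libgl_links() {
    for dir in "$@"; do
        for f in "$dir"/libGL.so*; do
            # Skip the unexpanded pattern when a directory has no match.
            [ -e "$f" ] || [ -L "$f" ] || continue
            if [ -L "$f" ]; then
                printf '%s -> %s\n' "$f" "$(readlink "$f")"
            else
                printf '%s\n' "$f"
            fi
        done
    done
}
```

For the directories above that would be `list_libgl_links /usr/lib/fglrx /usr/lib32/fglrx /usr/lib/x86_64-linux-gnu/mesa /usr/lib/i386-linux-gnu/mesa`.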

Do you think there should be any additional *1.2 or *1.2.0 links?

Thanks
Comment 11 Rakot 2013-01-06 11:47:27 CST
(In reply to comment #10)
I don't think so. As far as I know, both switchlibGL and the fglrx driver would need changes to match the behavior of the integrated GPU, because changing the symlinks alone is not enough (switchlibGL also calls ldconfig). This is one of the reasons why we cannot switch from the discrete GPU to the integrated one.

To be honest, I would prefer to run the discrete card through PRIME or PRIMUS (the latter is like bumblebee). I think this approach depends less on the intel driver than the current one. It also does not require restarting the X server every time I need the discrete card. But currently, whenever I try this, fglrx also tries to load the intel module and the second X server does not start. Is there an option that prevents the intel driver from being loaded?
More details are in https://github.com/Bumblebee-Project/Bumblebee/issues/52 .
Comment 12 Rakot 2013-01-20 23:00:17 CST
Just an update on the state of the bug. I've tried AMD Catalyst 13.1, and the integrated GPU works with Mesa 8.0.4, X server 1.12.3 and kernel 3.7 (the discrete card works fine). But the X server crashes again with Mesa 9.0. So the problem is partially resolved.

Should I open a new bug with a feature request for loading fglrx without the intel driver on intel+amd hybrid laptops, in order to use bumblebee?
Comment 13 Michael Cronenworth 2014-01-06 10:32:01 CST
This message is a reminder that your bug is marked as Catalyst 12.10.

The current legacy Catalyst version is 13.1.
The current Catalyst version is 13.12.
The current Catalyst beta version is 13.11.

Approximately 7 days from now the Bugzilla administrator will be removing the 12.10 version. At that time your bug will be CLOSED as WONTFIX.

Bug Reporter: Thank you for reporting this issue. However, the Bugzilla administrator provides this as an unofficial, free service to AMD customers, and I like to keep my systems neat and tidy. If you would like to keep your bug from being closed, please try a new Catalyst version and update the 'version' field if the issue still occurs.

If you are unable to update the version, please make a comment and someone will change it for you.
Comment 14 Michael Cronenworth 2014-01-13 15:30:45 CST
This bug is being closed due to the 'version' being 12.10 after 7 days of the previous closure notice.

Thank you for your bug report.