Can't cancel/remove stuck jobs

721339Sep 8 2009 — edited Sep 17 2009
This is my first venture into OSB and the first backup I've tried to do. I'm learning quickly.

I have two clusters, each with two nodes. One cluster has nodes sdorac2a and sdorac2b; the other has sdorac4a and sdorac4b. Both are configured the same way, and each has three OCFS2 cluster filesystems mounted at /usr/local, /home and /apps.

I created a dataset containing the following:

include host sdorac4b
include host sdorac2b
include path /home
include path /usr/local
include path /apps
exclude name core
exclude name *~

I have two tape drives configured in OSB, one on 2a and one on 2b, and created a schedule to run a one-time backup at 18:00 last night, using that dataset and writing to the tape drive on 2a only. At least I think I did. I might have made a mistake and told it to use the tape drive on 2b as well before changing it to use only 2a. I suspect that because of the catxcr output.

This morning both backups were still running. catxcr for jobs 1.1 and 1.2 both show the same output, but on different media nodes.

# obtool catxcr 1.1
2009/09/07.18:00:59 ______________________________________________________________________
2009/09/07.18:00:59
2009/09/07.18:00:59 Transcript for job 1.1 running on sdorac2a
2009/09/07.18:00:59
Backup started on Mon Sep 07 2009 at 18:01:01
Volume label:
Volume UUID: 918979c6-7df5-102c-9a98-0024817dfd32
Volume ID: VOL000001
Volume sequence: 1
Volume set owner: root
Volume set created: Mon Sep 07 18:01:01 2009
Original UUID: 918979c6-7df5-102c-9a98-0024817dfd32

Archive label:
File number: 1
File section: 1
Owner: root
Client host: sdorac4b
Backup level: 0
S/w compression: no
Archive created: Mon Sep 07 18:01:01 2009
Encryption: off


Dumping all files in /home

Dumping all files in /usr/local

Dumping all files in /apps
Opening device /dev/sg0 failed - device is busy (OB scsi device driver)



# obtool catxcr 1.2
2009/09/07.18:01:00 ______________________________________________________________________
2009/09/07.18:01:00
2009/09/07.18:01:00 Transcript for job 1.2 running on sdorac2b
2009/09/07.18:01:00
Backup started on Mon Sep 07 2009 at 18:01:03
Volume label:
Volume UUID: 93f53858-7df5-102c-bab1-0024817e168a
Volume ID: VOL000002
Volume sequence: 1
Volume set owner: root
Volume set created: Mon Sep 07 18:01:03 2009
Original UUID: 93f53858-7df5-102c-bab1-0024817e168a

Archive label:
File number: 1
File section: 1
Owner: root
Client host: sdorac2b
Backup level: 0
S/w compression: no
Archive created: Mon Sep 07 18:01:03 2009
Encryption: off


Dumping all files in /home

Dumping all files in /usr/local

Dumping all files in /apps
Opening device /dev/sg0 failed - device is busy (OB scsi device driver)

I've no idea why this would happen, but decided to cancel the jobs before looking into it further as they've just been sitting there all night.

I tried cancelling the jobs but this hasn't worked. They're just in a pending state now.

# obtool lsjob
Job ID           Sched time  Contents                       State
---------------- ----------- ------------------------------ ---------------------------------------
1.1              09/07.18:00 backup sdorac4b                running since 2009/09/07.18:00; cancellation pending
1.2              09/07.18:00 backup sdorac2b                running since 2009/09/07.18:00; cancellation pending
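
As an aside, here is a minimal sketch of how jobs stuck in this state could be picked out of the lsjob listing from a script. It is not part of OSB: the function name `stuck_jobs` is hypothetical, and it assumes the listing has already been captured as text and that a stuck job is one whose state contains the phrase "cancellation pending", as in the output above.

```python
import re

# Assumed job ID shape: digits with optional dotted parts (e.g. "1.1"),
# optionally prefixed by "something/". This is a guess based on the listing above.
JOB_ID = re.compile(r"^(?:\S+/)?\d+(?:\.\d+)*$")


def stuck_jobs(lsjob_output):
    """Return IDs of jobs whose state mentions 'cancellation pending'."""
    stuck = []
    for line in lsjob_output.splitlines():
        fields = line.split()
        # Skip blank lines, the header row and the dashed separator row.
        if not fields or not JOB_ID.match(fields[0]):
            continue
        if "cancellation pending" in line:
            stuck.append(fields[0])
    return stuck


# Sample text copied from the lsjob output in this post.
SAMPLE = """\
Job ID Sched time Contents State
---------------- ----------- ------------------------------ ---------------------------------------
1.1 09/07.18:00 backup sdorac4b running since 2009/09/07.18:00; cancellation pending
1.2 09/07.18:00 backup sdorac2b running since 2009/09/07.18:00; cancellation pending
"""

if __name__ == "__main__":
    print(stuck_jobs(SAMPLE))
```

This only reports which jobs are stuck; it does not attempt to remove them.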


catxcr now shows one extra line at the end of each transcript (only the PID differs):

Error: [27325] killed
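
For anyone comparing several transcripts, a rough sketch of how the problem lines could be pulled out of catxcr output. It assumes the transcript has been saved as text and that the lines of interest either contain the word "failed" or start with "Error:", as they do here; the name `problem_lines` is made up for illustration.

```python
import re

# Assumption: a line is interesting if it contains "failed" or starts with "Error:".
PROBLEM = re.compile(r"\bfailed\b|^Error:", re.IGNORECASE)


def problem_lines(transcript):
    """Return the stripped transcript lines that look like failures."""
    hits = []
    for raw in transcript.splitlines():
        line = raw.strip()
        if line and PROBLEM.search(line):
            hits.append(line)
    return hits


# Tail of the job 1.1 transcript as quoted in this post.
TAIL = """\
Dumping all files in /apps
Opening device /dev/sg0 failed - device is busy (OB scsi device driver)
Error: [27325] killed
"""

if __name__ == "__main__":
    for hit in problem_lines(TAIL):
        print(hit)
```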

How do I get rid of them now? Is there a way to force remove them?

Rgds,

John
Post Details
Locked on Oct 15 2009
Added on Sep 8 2009