Server down - fixed


Steve_S
09-05-2002, 01:37 PM
Hello All,

Our dedicated server was down for about 1.5 hours this AM. Some of you could not reach the site.

The /tmp partition filled up. This has been fixed. The server now uses a regular directory for /tmp (not a separate partition), so it's unlikely to run out of space again.

My apologies for any inconvenience this may have caused you.
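
For anyone who wants an early warning before a partition fills up like this, here is a minimal sketch in Python. It is not what runs on this server; the function names and the 10% threshold are assumptions chosen purely for illustration.

```python
import shutil


def free_fraction(path="/tmp"):
    """Return the fraction (0.0 to 1.0) of free space on the filesystem holding `path`."""
    usage = shutil.disk_usage(path)  # named tuple with total, used, free (in bytes)
    return usage.free / usage.total


def is_low_on_space(path="/tmp", min_free=0.10):
    """Return True when less than `min_free` of the filesystem is free.

    The 10% default is an arbitrary illustrative threshold.
    """
    return free_fraction(path) < min_free
```

Something like `is_low_on_space("/tmp")` could be called from a cron job that sends mail when it returns True.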

madaweb
09-05-2002, 02:27 PM
Hi Steve
Thought you should know that for some of us it was down from around 9:30AM Eastern time to about 2:00-2:30PM this afternoon.
Glad you're back.

Steve_S
09-05-2002, 02:54 PM
Yep. I checked my error_log again and the down time was about 4 hours. I was sleeping :) for most of this time.

What a way to start the day :)

singloon
09-06-2002, 12:06 AM
What filled up the /tmp directory?

Furton
09-06-2002, 07:54 AM
I'm guessing it was server logs, right Steve? I had the exact same problem last week.

Steve_S
09-06-2002, 12:32 PM
Good question. I'm not 100% sure, but server logs are probably the reason.

DI fixed the issue so fast I was still poking around on the box to see when the sites came up. They even called me, which has never happened before, and I can assure you I'm a small fry with them. My tape BU also gives me 4 free tickets every month.

The story:

Check email. NO email. PW prompt

None of the sites resolve in a browser

Three large bolts of caffeine

Hello Dr. Root. Telnet in. Yep, the box is still running and was not rebooted. <cry - uptime 101 days> Lookie lookie, there's George's neat MySQL tweaks. :) DB still running and topics appear on remote sites via last10.php. All the other services are running.

Another large bolt of Java

Restart Apache and Named

Run to my other line. Still no sites in a browser. Trace is fine.

I can see the geeks hitting the site.

Open ticket with the emergency flag

While the ticket is open I take a gander at my error_log

<Gasp>

Dozens of these:

[Thu Sep 5 11:54:38 2002] [error] mod_gzip: EMPTY FILE [/tmp/_18435_347_248.wrk] in sendfile2
[Thu Sep 5 11:54:38 2002] [error] mod_gzip: Make sure all named directories exist and have the correct permissions.
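
To see at a glance how many of these pile up, a rough Python sketch like the one below could tally them. It assumes the error_log lines look like the two quoted above; the function name and the regular expression are my own illustration and not part of Apache or mod_gzip.

```python
import re
from collections import Counter

# Matches error_log lines shaped like the ones quoted above, e.g.
# [Thu Sep 5 11:54:38 2002] [error] mod_gzip: EMPTY FILE [...] in sendfile2
LINE_RE = re.compile(r"^\[(?P<when>[^\]]+)\] \[(?P<level>\w+)\] (?P<msg>.*)$")


def count_mod_gzip_errors(lines):
    """Count mod_gzip error messages, grouping lines that differ only by file name."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match or match.group("level") != "error":
            continue
        msg = match.group("msg")
        if not msg.startswith("mod_gzip:"):
            continue
        # Replace bracketed paths so different temp file names count as one message.
        key = re.sub(r"\[[^\]]*\]", "[...]", msg)
        counts[key] += 1
    return counts
```

Feeding it an open error_log file (files iterate line by line) returns a Counter keyed by message text.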

22 minutes later sites resolve and the phone rings. Hello <surprise>

A minor tweak or two and all is well

Today:

Fri Sep 6 13:30:56 EDT 2002


1:30pm up 1 day, 25 min, 0 users, load average: 0.05, 0.15, 0.14
107 processes: 106 sleeping, 1 running, 0 zombie, 0 stopped
Mem: 772124K av, 587608K used, 184516K free, 0K shrd, 13184K buff
Swap: 1575744K av, 18340K used, 1557404K free 402856K cached



Http processes currently running = 36
Mysql processes currently running = 42

Netstat information summary
1 ESTABLISHED
15 TIME_WAIT
18 LISTEN


:)

singloon
09-07-2002, 05:13 AM
Ah, it was mod_gzip's tmp files... that's the reason I configure mod_gzip's tmp directory as /home/modgziptmp or something :)
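
If you want to check how much space those work files take up in whatever directory mod_gzip is pointed at, here is a small Python sketch. The `.wrk` suffix is taken from the file name in the error_log above; the function name is hypothetical and nothing here comes from mod_gzip itself.

```python
import os


def work_file_usage(directory, suffix=".wrk"):
    """Return (file_count, total_bytes) for files ending in `suffix` directly inside `directory`.

    Subdirectories are not descended into and symlinks are not followed.
    """
    count = 0
    total = 0
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file(follow_symlinks=False) and entry.name.endswith(suffix):
                count += 1
                total += entry.stat(follow_symlinks=False).st_size
    return count, total
```

For example, `work_file_usage("/tmp")` gives the number of `.wrk` files and their combined size in bytes.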