Pre Migration Site Down 7/14 ~11:05 AM CT

Discussion in 'Announcements, How do I, Feedback' started by Nathan, Jul 14, 2010.

  1. Nathan

    Nathan Founder

    Mar 30, 2009
    25,144
    10,032
    113
    Writer
    Short North
    Ratings:
    +10,049 / 0 / -0
    Today there was a bit of downtime: when trying to connect, you first received a database error message and then a Site not Available message. The outage lasted until ~11:23 AM CT.

    The database error occurred because the swap files were at 100%, so data could not be processed. To fix this I had to reboot. That took about 7 minutes all told and was the cause of the Site not Available message.

    I'm in the process of learning how to identify what is causing the swap dirs to fill up, but I suspect it is the Top Gear files. I'm also learning how to create a script that will clean these up before the swap space gets to the same point again. In the meantime I'll probably have to reboot the server now and again. I'll try to schedule these reboots for a time window when the fewest members will be affected.

    Thanks for supporting M/A and bearing with my learning curve on running a web server.

    Also, if you are a UNIX/Linux guru and know this stuff... let's talk, please.
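
    A minimal sketch of the first step described above, finding out what is filling the temp space, might look like this in Python. The /tmp path, the function names, and the top-10 limit are assumptions for illustration only; the actual location of the swap dirs on the server is not given here.

```python
import os
import shutil


def usage_percent(path):
    """Return how full the filesystem holding `path` is, as a percentage."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def largest_files(root, limit=10):
    """Return up to `limit` (size_in_bytes, path) pairs, biggest first."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                sizes.append((os.path.getsize(full), full))
            except OSError:
                # The file vanished or is unreadable; skip it.
                continue
    sizes.sort(reverse=True)
    return sizes[:limit]


if __name__ == "__main__":
    TMP_DIR = "/tmp"  # assumption: the real temp/swap location is not stated
    print(f"{TMP_DIR} filesystem is {usage_percent(TMP_DIR):.1f}% full")
    for size, path in largest_files(TMP_DIR):
        print(f"{size / 1_000_000:10.1f} MB  {path}")
```

    This only reports; it deletes nothing. A cleanup script would need to know which files are safe to remove, and that depends on what is actually writing them.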
     
  2. Metalman

    Metalman Well-Known Member
    Lifetime Supporter

    Sep 29, 2009
    12,569
    7,457
    113
    Owner of a small custom metal fabrication company,
    Columbus, Ohio
    Ratings:
    +7,495 / 1 / -0
    I was on when it happened. I'm glad that's all it was. I had fears it was another "meltdown".:Thumbsup:
     
  3. Nathan

    Nathan Founder

    Mar 30, 2009
    25,144
    10,032
    113
    Writer
    Short North
    Ratings:
    +10,049 / 0 / -0
    Mechanisms are in place, and have been tested, that will prevent another meltdown like the one we experienced before. Backups are run every hour and stored off-site. The most we would lose is an hour's worth of material.
     
  4. Minidave

    Minidave Well-Known Member
    Lifetime Supporter

    Dec 22, 2009
    5,089
    3,228
    113
    Male
    Overland Park, Ks
    Ratings:
    +3,591 / 1 / -0
    Weird. When it happened I got a message that said Google was unavailable (I use Google for my homepage), yet I could go directly to other sites. Then a little while later all was right with the world again!

    Strange goings-on out there on the interwebs!

    Good job on finding the culprit and getting the site up again Nathan, well done!

    Have a cold one on me..... :beer
     
  5. lotsie

    lotsie Club Coordinator

    May 5, 2009
    3,924
    400
    83
    stagehand/part time detailer
    Right here
    Ratings:
    +400 / 0 / -0
    When I get yakking, that could be boat loads of useless info lost:frown2::lol:

    Mark
     
  6. Nathan

    Nathan Founder

    Mar 30, 2009
    25,144
    10,032
    113
    Writer
    Short North
    Ratings:
    +10,049 / 0 / -0
    Working with the host, we have expanded the temp space from 512MB to 5GB. This will prevent the issue in the future. Video uploads will not be counted in this tmp space. This was done in relation to a MySQL repair maintenance operation.
     
  7. galoki

    galoki New Member

    Sep 8, 2009
    128
    0
    0
    Ratings:
    +0 / 0 / -0
    Pesky swap files! Your host should know better than to provision such a small swap space.
     
