Posts: 35 | Thanked: 504 times | Joined on Jan 2013 @ Germany
#1
Hi everybody,

sorry for the short notice, but we will do some heavy maintenance on the maemo.org infrastructure tomorrow, starting at 10:00 CET (09:00 UTC).

All systems will be affected.

We expect to be down for at least 6 hours as we do upgrades on the underlying hypervisors.

What we will do:
  • Do an image backup of all machines
  • Upgrade the underlying hypervisors
  • Upgrade individual machines

Sorry for any inconvenience this might cause.

Best,

Falk
__________________
--
We reject kings, presidents and voting.
We believe in rough consensus and running code.
- David Clark
 

The Following 22 Users Say Thank You to fstern For This Useful Post:
peterleinchen's Avatar
Posts: 4,117 | Thanked: 8,901 times | Joined on Aug 2010 @ Ruhrgebiet, Germany
#2
Thanks for the notification.

@tmo admin
Could this be made sticky forum-wide?
__________________
SIM-Switcher, automated SIM switching with a Double (Dual) SIM adapter
--
Thank you all for voting me into the Community Council 2014-2016!

Please consider your membership / supporting Maemo e.V. and help to spread this by following/copying this link to your TMO signature:
[MC eV] Maemo Community eV membership application, http://talk.maemo.org/showthread.php?t=94257

editsignature, http://talk.maemo.org/profile.php?do=editsignature
 

The Following 5 Users Say Thank You to peterleinchen For This Useful Post:
Posts: 35 | Thanked: 504 times | Joined on Jan 2013 @ Germany
#3
Hi everyone,

tl;dr: half of the infrastructure is broken, fix expected early next week, film at eleven.

This maintenance didn't go to plan. Here's a short post-mortem:

Timeline:

10:00 - start updates and backups on blade-a
14:30 - backups and updates complete on blade-a, reboot confirmed successful
14:31 - uptime-induced filesystem check after 1347 days
15:00 - start of backups on blade-b
17:12 - filesystem check complete, blade-a up and running
17:30 - first systems on blade-a confirmed up and working
18:30 - software upgrade on stage and mail complete
20:15 - backups of blade-b finished and copied onto blade-a backup space
20:16 - start of updates on blade-b
21:00 - updates on blade-b complete, reboot
21:01 - blade-b stuck in boot with a corrupt BIOS image in flash
23:30 - all available remote recovery options tried, none working
23:40 - decision to go for Plan B, boot talk.maemo.org on blade-a, redirect everything else to talk.m.o
23:45 - blade-b turned off through IPMI
23:53 - talk.m.o available again
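
To put numbers on the phases, here is a minimal Python sketch that derives durations from the timeline above. It assumes all timestamps fall on the same day and in the same timezone; the phase labels and the helper names `elapsed` and `format_delta` are made up for illustration.

```python
from datetime import datetime, timedelta

TIME_FORMAT = "%H:%M"

def elapsed(start: str, end: str) -> timedelta:
    """Time between two same-day HH:MM timestamps."""
    return datetime.strptime(end, TIME_FORMAT) - datetime.strptime(start, TIME_FORMAT)

def format_delta(delta: timedelta) -> str:
    """Render a timedelta as e.g. '2h41m'."""
    minutes = int(delta.total_seconds()) // 60
    return f"{minutes // 60}h{minutes % 60:02d}m"

# Start/end pairs copied from the timeline; the labels are illustrative.
PHASES = {
    "maintenance start until talk.m.o available again": ("10:00", "23:53"),
    "filesystem check on blade-a": ("14:31", "17:12"),
    "backups of blade-b": ("15:00", "20:15"),
    "remote recovery attempts on blade-b": ("21:01", "23:30"),
}

for label, (start, end) in PHASES.items():
    print(f"{label}: {format_delta(elapsed(start, end))}")
# maintenance start until talk.m.o available again: 13h53m
# filesystem check on blade-a: 2h41m
# backups of blade-b: 5h15m
# remote recovery attempts on blade-b: 2h29m
```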

Fallbacks in place:

www.maemo.org, wiki.maemo.org, garage.maemo.org are redirected to talk.maemo.org
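
The thread does not say how the redirect is implemented. As a rough illustration only, a host-based temporary redirect could look like the Python sketch below. The handler class, the helper `fallback_location`, and the choice of HTTP 302 are assumptions, not the actual maemo.org setup.

```python
from http.server import BaseHTTPRequestHandler

# Hosts and target taken from the post; everything else is hypothetical.
FALLBACK_TARGET = "http://talk.maemo.org/"
REDIRECTED_HOSTS = {"www.maemo.org", "wiki.maemo.org", "garage.maemo.org"}

def fallback_location(host_header):
    """Return the redirect target for a Host header, or None if not covered."""
    host = (host_header or "").split(":")[0].lower()
    return FALLBACK_TARGET if host in REDIRECTED_HOSTS else None

class FallbackRedirect(BaseHTTPRequestHandler):
    """Answers GET requests for the affected hosts with a temporary redirect."""

    def do_GET(self):
        location = fallback_location(self.headers.get("Host"))
        if location is None:
            self.send_error(404, "Unknown host")
            return
        # 302 rather than 301, so browsers do not cache a temporary fallback.
        self.send_response(302)
        self.send_header("Location", location)
        self.end_headers()
```

The handler could be served with `http.server.HTTPServer(("", 8080), FallbackRedirect).serve_forever()`; the port is arbitrary.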

Next Action Items:

I'll visit the datacenter on Monday after work (around 18:00 CET) to try to recover the BIOS of the broken machine with a physical USB stick.

If this is successful, we'll migrate talk.m.o back to its original host and re-enable www.m.o, wiki.m.o and garage.m.o through DNS once the VMs and the blade are confirmed working.


Best,

xes & falk
 

The Following 23 Users Say Thank You to fstern For This Useful Post:
pichlo's Avatar
Posts: 6,445 | Thanked: 20,981 times | Joined on Sep 2012 @ UK
#4
My browser complains about a wrong certificate; is this a side effect of the update? Is it temporary?
(Details: the name on the cert does not match the URL.)
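
For readers wondering what "name does not match" means in practice: a browser compares the hostname in the URL against the names listed in the certificate. The Python sketch below is a simplified illustration of that comparison, not the browser's actual algorithm. The certificate name lists are hypothetical, since the thread does not say which names the maemo.org certificates carry.

```python
def name_matches(pattern: str, hostname: str) -> bool:
    """Compare one certificate name against a hostname.

    Simplified rule: a leading "*." matches exactly one leftmost label.
    """
    pattern = pattern.lower().rstrip(".")
    hostname = hostname.lower().rstrip(".")
    if pattern.startswith("*."):
        suffix = pattern[1:]  # e.g. ".maemo.org"
        head, sep, tail = hostname.partition(".")
        return bool(head) and sep == "." and "." + tail == suffix
    return pattern == hostname

def cert_covers(cert_names, hostname: str) -> bool:
    """True if any name in the certificate matches the hostname."""
    return any(name_matches(name, hostname) for name in cert_names)

# Hypothetical certificate issued only for the forum host:
print(cert_covers(["talk.maemo.org"], "talk.maemo.org"))  # True
print(cert_covers(["talk.maemo.org"], "wiki.maemo.org"))  # False -> browser warning
# Hypothetical wildcard certificate:
print(cert_covers(["*.maemo.org"], "wiki.maemo.org"))     # True
```

If a host that normally serves its own certificate is temporarily answered by a machine holding a certificate for a different name, this check fails and the browser warns.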
__________________
Russian warship, go fuck yourself!
 

The Following 2 Users Say Thank You to pichlo For This Useful Post:
peterleinchen's Avatar
Posts: 4,117 | Thanked: 8,901 times | Joined on Aug 2010 @ Ruhrgebiet, Germany
#5
Originally Posted by joerg_rw View Post
many thanks for this massive effort ...
+1

A hint for all remaining N9 users: once again we have no automatic network detection (WLAN auto/manual). A nice screenshot is attached (maybe later; my N9 does not let me select it).

--edit
Originally Posted by pichlo View Post
My browser complains about a wrong certificate; is this a side effect of the update? ...
I guess so, as the same corrections/redirections were also made earlier this year.

Last edited by peterleinchen; 2016-11-20 at 08:38. Reason: added pic from N900
 

The Following User Says Thank You to peterleinchen For This Useful Post:
Posts: 638 | Thanked: 1,692 times | Joined on Aug 2009
#6
Let me share the screen that our Supermicro server showed to reward us for a day of work...
http://www.supermicro.nl/products/sy...cfm?parts=SHOW

We also discovered that Supermicro charges for a license to flash the BIOS remotely via IPMI.
(Anyway, we are not sure this would work to recover the BIOS.)

Supermicro: really, thanks.

The Following 9 Users Say Thank You to xes For This Useful Post:
Win7Mac's Avatar
Community Council | Posts: 664 | Thanked: 1,648 times | Joined on Apr 2012 @ Hamburg
#7
Possible to replace the chip?
__________________
Nokia 5110 > 3310 > 6230 > N70 > N9 BLACK 64GB
Hildon Foundation Board member
Maemo Community e.V. co-creator, founder and director since Q4/2016
Current Maemo Community Council member
 

The Following 2 Users Say Thank You to Win7Mac For This Useful Post:
Posts: 638 | Thanked: 1,692 times | Joined on Aug 2009
#8
@win7mac
At the moment I can't say how serious the problem is. We will know more tomorrow, once Falk has run some tests while trying to restore the blade.

Also, with your personal PC, board or laptop you can try whatever you want, any hack or trick, because you have nothing to lose. With servers you need a different perspective: you have to weigh the risks, the best options, the time to fix, the quality of the result and the possibility of doing more damage.

So my reply is: I don't think anyone would remove a chip from a server mainboard without a spare board or a guaranteed result.

Last edited by xes; 2016-11-21 at 00:03.
 

The Following 7 Users Say Thank You to xes For This Useful Post:
Win7Mac's Avatar
Community Council | Posts: 664 | Thanked: 1,648 times | Joined on Apr 2012 @ Hamburg
#9
I wasn't suggesting any tricks or hacks. Some BIOS chips are replaceable, but since it's not listed on that parts list, that's probably not an option.
 

The Following 3 Users Say Thank You to Win7Mac For This Useful Post:
Posts: 35 | Thanked: 504 times | Joined on Jan 2013 @ Germany
#10
Originally Posted by joerg_rw View Post
plus we have two spare blades, incl. BIOS chips (if the flash of the now-down blade is actually defective)
edit: I think it would actually be a great opportunity to swap the blades for wear leveling
No, we don't. All we have is two empty slots in the chassis.

Best,

Falk
 

The Following 5 Users Say Thank You to fstern For This Useful Post:
All times are GMT.