Inoue Katsu

Perils of a sysadmin

I figured it might be entertaining for others to read about the kind of things I run into at work: rather bizarre, yet things you're quite likely to run into yourself when you work in IT. And it gives me a chance to vent a little =p

To give a bit of background information, I am a tier 3 sys/network admin at an office with around 200 computers, connected to 3 other buildings with 200-250 computers each, which each have their own IT infrastructure and staff. (Each is its own organization. This might sound weird, but for privacy reasons I'm leaving out what kind of office I work at, which would make it make more sense :-) )

Tier 3 means I get to deal with disasters when everybody else is freaking out. Considering our small size, tier 1 are interns who answer the phone and deal with small (l)user problems, tier 2 are the normal sysadmins (there's 3 of 'em) and then there's me.

I also deal with smaller problems when there's nothing broken, but usually I just growl and let others deal with those, taking a more advisory role instead, which gives me time to slack off. I also occasionally pitch in for our neighbors when they have something they can't figure out, being the only networking expert.

We also run a rather varied range of OSes on our servers, ranging from Win2k3/8 to enterprise and desktop Linux versions, as well as 'true' Unix, so there's lots of variation in problems. Our main network is not Microsoft, so that means we have a lot fewer issues than our neighbors on the network.

I have also read the BOFH diaries more than once from start to finish =p

So today I get an alert that the CPU of our main mail server (runs on Linux) is generating an above-average load. While I'm firing up the CPU monitor in a shell the phone rings .. it's one of our interns, and he's going on about how somebody's mail client is freaking out. After spending about a minute trying to figure out what is going on (not the brightest intern) it turns out this user 'magically' had 56000 new e-mails in their mailbox ..

Now I know for a fact that we only have around 100k received e-mails in our system on average, me yelling at people and a system-wide auto-delete policy having educated our users in semi-decent mailbox management.

Right here we obviously had the source of our high CPU load, the server having generated all those 56000 e-mails in about 5 minutes' time... After inquiring where these messages were coming from I checked the 'sender' user's mailbox for weird rules (rule-generated ping-pong messages are frequent: auto reply on auto reply) and whatnot, but that was clean, so I decided to take a better look at the stuffed mailbox, which they were trying to delete messages from. Deleting messages is a lot slower than creating automatic messages, so this was going at an amazing rate of 10 per second.

After doing some more inquiring (you REALLY have to learn how to interrogate people and what questions to ask when you work in IT) it turned out that just before this they had sent out an appointment to a few (thankfully just a few) other people.

This appointment was a recurring one for every day for the rest of the year, and they had marked it to say they wanted a notification about who opened it, who accepted or declined it, and further status tracking..

So for each user, for each day of the year, upon opening/accepting the appointment, the system generated a pile of messages and sent those back to the originating user .. coming to a grand total of over 56000 messages.
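
To see how quickly that multiplies, here is a minimal Python sketch. The function name and the three example numbers are hypothetical: the actual number of recipients, remaining days and tracking messages per occurrence in this incident is not given above.

```python
def tracking_messages(recipients: int, occurrences: int,
                      notifications_per_occurrence: int) -> int:
    """Messages sent back to the organizer of a tracked recurring appointment.

    Every recipient triggers some number of tracking messages
    (opened / accepted / declined ...) for every single occurrence.
    """
    return recipients * occurrences * notifications_per_occurrence


# Made-up example values, NOT the real ones from the incident:
# 5 recipients, 300 remaining days, 3 tracking messages per occurrence.
print(tracking_messages(recipients=5, occurrences=300,
                        notifications_per_occurrence=3))  # 4500
```

Even with small inputs the total runs into the thousands, because all three factors multiply.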

Deleted user mailbox, ran a cleanup and created a new mailbox .. problem solved.

Protip: over-scale your e-mail server hardware EVEN if it runs on Linux (dual quad-core laughed at this). This will save you a lot of future problems and user performance complaints, and lets you postpone upgrades for an extra year or 2 :-)

Next to this, business computers these days come with 22" monitors by default here. When your old standard monitor is 19" and you get a new shipment, your company WILL become a statistical anomaly for people with eye problems. Statistically speaking, all the people in the whole region who would have trouble with a 19" monitor work in my office.

We do not replace all computers at once, there are 4 different models in use that get rotated out.

Now you'd think this would not really be a problem, but people come up with the weirdest reasons for why they need a new computer/monitor.

This even leads to aggressive behavior towards IT staff (usually towards the interns), trying to trick interns into making promises that normal staff have already declined, attempts to bypass IT by going straight to the top with angry faces and/or teary eyes, and if that fails they'll actually call in sick. You'll notice people who work close to staff often have high-end computers even though all they do is MS Office, while somebody across the hall is actually having efficiency issues with their brand-new software X that runs poorly on a 4-year-old computer.

IT constantly has to balance things to keep people happy, and even 'cheat': give the latest model to somebody who would benefit from a fast computer but already has a 1-year-old one, so you can move last year's model, with a new monitor, to somebody who is causing drama and tell them it's a new computer.

We do this a lot :-) Gives some extra work for us (us being mostly just the interns) but keeps people happy.

Another few unusual incidents this week..

We have an application launcher with a network-loaded list of applications people can install and launch (if they have access to them) .. so somebody called saying that when they clicked an icon the wrong program launched. After starting a remote control session, with me stating that it was impossible for it to do something like that and the guy on the other side insisting that it was the case, I watched this person click an icon and get the appropriate program for said icon. But he was not clicking the icon he claimed he was clicking...

Turns out a new piece of software shifted the icons and he was just clicking the area where it used to be without really looking.

Chalk up another one for the facepalm list.

We're also rolling out new desktop computers, including to the department with the highest concentration of people with eye 'problems' and other unusual medical conditions.

(The top weird medical condition so far is RSI in the pointing finger because they had to click too much to scroll in a certain piece of software, thus they demanded (they went as far as to cry and go home sick) mice with scroll wheels (which you incidentally also operate with the pointing finger, but this was ignored repeatedly when mentioned by IT).)

So this person calls and goes on about how they had shortcuts to files on their desktop. After I told them it is policy that local files are not managed or supported by IT, and that users are responsible for what they put on their desktop, this person started to show the first signs of panic.

After listening to how 'they had not put the shortcuts there' and the 'files were critical to their work' and some other excuses about how important it was that she got her shortcuts back I asked if she knew where the files were located on the network or if anyone else nearby knew.

Obvious answer: 'no idea'

We do have a few million office files on our network, but access is fairly well limited for an individual user, so all they have to do is look at 2 network drives and click around a bit to find *most* of their required files. But this person had already stated that she had no idea what a 'home directory' was and was throwing up the usual excuses: 'I'm not as tech as you guys', 'isn't it *your* job to know these things?' etc etc (I'm quite sure 'basic computer and office knowledge' is listed in everyone's job requirements these days, and basic computer knowledge is more than 'know where the power button is')

I already knew where things were going to end up, but people like this need to have some fear put into them every now and then, so I hinted that their files were lost forever and that their old computer was probably already wiped (I knew it wasn't)

Now at this point people usually start (feigning) hopelessness and going on about 'what am I supposed to do now, I might as well go home!' to which I usually reply 'it's your job, just do what you think is best to get your work done' (I can't tell 'em to go home, because they will and then it'll be *my* fault)

This is where they get frustrated and confused enough about my lack of 'understanding' that I can get them to hang up, knowing that they'll go straight to their chief to complain and come up with some drama story. This gives me 5 minutes of respite to do some actual work and send an intern off to go fetch that person's old computer too.

Sure enough, 5 minutes later the phone rings, it's their chief .. at this point the person in question is usually still in the room, so I have somebody else answer the phone to tell them I'm not available right now and either take a message or ask if I should phone back. It's easier to have a conversation about somebody with a chief when they are not in the room.

Meanwhile an intern is booting up the person's old computer and comes over to tell me what these life-or-death shortcuts are about. Turns out they were links to some self-made travel cost declaration form, a vacation schedule and some non-work-related Internet site...

5 more minutes have passed and I call the chief back. The person had *just* left the room (meaning they spent another few minutes whining) and I listen to him ask why we did not copy this person's files over.

I give him the standard explanation about how all files *need* to be on the network for backups and such, and that the intern who swapped this person's computer remembered seeing the desktop files but said it was only what I described earlier, which are quite easy to locate again on the network and our local intranet. Then I told him that, as a gesture of goodwill, I had already told an intern to go and try to find this person's computer to recover the files, but seeing as we had a lot of identical computers it would take some time to locate it, though we'd do our best. I also gave him the rundown of our experiences with this user and their lack of computer know-how, and suggested a basic computer training course.

Chief happy and informed, user thinking they were victorious.

Told intern to make a calendar appointment to go copy old files over some time next week and to bring a booklet with basic computer lessons for senior people (person is in her early 40s) as well.

The art of IT is to make the other person feel how stupid they really are without saying it out loud.

ROFL

When I actually help out my clients, especially Gobshite Quarterly ( http://www.gobshitequarterly.com/ ), they are actually more computer savvy than the person you just described. Hell, GobQ even keeps a backup of their computers on an external hard drive ('course they use Mac, but not going to debate Mac vs. PC). And all of my clients are about the same age, if not older, than that person.

Mac has this way back thing (or something) that actually works quite well to put its files somewhere else. From there you can take em in the normal network backup.

We just have a policy that nothing should be stored on the local drive other than system software, and all the network drives that contain user data get picked up by the network backup.

It's been a while since I've had anything happen worth mentioning, but after some internal drama and vacation time here's something you'll run into a lot, especially if you are the mail system BOFH..

Yes, for e-mail systems you will need to be BOFH, not just 'Administrator'.. People take e-mail for granted and do not realize the difference between an enterprise e-mail system and something ISPs provide.

So somebody called me and told me "I'm trying to send this 20MB attachment to somebody else in the building and it's not letting me."

The first thing I said directly afterwards is 'e-mail is not intended for those kinds of attachments .. who are you trying to send it to and don't you have a shared network folder that you both have access to ..'

After explaining to this guy that we have this amazing thing called a 'file server' that 'serves files' to everyone in the building it turned out that he was trying to send a file -from- a folder that was specifically named and purposed for storing files for that whole department, to somebody who was also in that -same- department...

Then he goes "But it is so much easier to open it from the e-mail.."

This is the point where I go:

no.jpg

I told him: 'If I allowed you to send that 20MB e-mail, the recipient would have an overflowing inbox because of the size limit, would have to spend time cleaning said inbox, spend time calling ME asking for a bigger inbox, and complain to their manager about having to do too much inbox cleaning and about the resistance from IT to increasing said inbox size and sending limit. Then said manager spends at least an hour or 2 lobbying with other managers to see if anyone else complained about the same thing, then spends another hour during a staff meeting raising said issue with top management, then top management has to go to IT asking why we did not increase the maximum file size, then IT (me) has to spend 30 minutes explaining to top management how e-mail servers have less disk space than a file server that is meant to serve files, how it would lead to file decentralization and multiple versions of the same file floating around the network, and how bad that would be when decisions get made based on the wrong version of a file that somebody said was done. Then top management has to raise this again in the next staff meeting with the other managers, and then it goes back to the employees.

And all that just because you think it is easier to read an e-mail instead of browsing to a folder. Do you have any idea how many man-hours and how much money it will cost this company if I let you send that e-mail?'

After a few seconds of stuttering and silence he goes 'I'll just send them a link to the location.'

yea you do that ..

Just to spread some mail server management experience..

For maintaining e-mail systems there are a few things you need to do:

- Build your own system from scratch so you know how it works!

If you start somewhere that has a running system that is relatively new and not scheduled for replacement, grab an old server or desktop (or 2 desktops if the production mail system has more than 1 physical server, or install VMware on the old server), install the same OS on it (screw licensing) and try to build your own replica of the production environment. This will help loads with troubleshooting because you'll know all the crap e-mails have to pass through to get from A to B.

- Read up on what problems others are having.

Dig through forums and mailing lists and find out what makes your mail system cry. If said problems are related to a combination of settings, check said settings on your system and find out if there are side effects to changing said settings and raise an issue to possibly change them.

- Do the 'don't do this' things on your test system just to see what kind of symptoms they cause and then try to fix it.

- Especially if you run an Exchange server, it will DIE horribly if you run out of disk space! Take the biggest allowed attachment, multiply that by the number of users, then multiply that by 10 if you have ~200 users (probably 30 for 500 users), and that is the minimum disk space you need to have available at all times. If free space drops to that point you are possibly in deep shit, as a single idiot e-mailing a maximum-size attachment to the *whole* company and then a bunch of people -replying- to said e-mail with the attachment still attached will kill your server and give you lots of overtime hours.
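
The rule of thumb above can be written out as a small calculation. This is only a sketch: the function name is made up, and the 10 MB maximum attachment size is an assumed example value, not a limit stated in this post. The multipliers (10 for ~200 users, probably 30 for 500) are the ones from the text, passed in explicitly rather than guessed for other user counts.

```python
def min_free_space_mb(max_attachment_mb: float, users: int,
                      multiplier: float) -> float:
    """Minimum free disk space to keep available, in MB.

    biggest allowed attachment x number of users x safety multiplier
    """
    return max_attachment_mb * users * multiplier


# Assumed 10 MB attachment limit (hypothetical), multipliers from the text.
print(min_free_space_mb(10, 200, 10))  # 20000 MB, roughly 20 GB
print(min_free_space_mb(10, 500, 30))  # 150000 MB, roughly 150 GB
```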

Rule # 1 of sysadmining is that overtime caused by planned upgrades or unforeseen events is OK, overtime caused by stupid users or failures that could have easily been prevented is TOTALLY NOT OK! You want to be doing fun things outside of work.

There needs to be a trigger for any mail system: when disk space runs low on its storage, it needs to stop all input!
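
A minimal sketch of such a trigger in Python, using the standard library's shutil.disk_usage. The function name, the path and the threshold are assumptions for illustration. How you actually stop intake (pausing the MTA, refusing SMTP connections, ...) depends on your mail system and is not shown here.

```python
import shutil


def should_stop_intake(spool_path: str, min_free_bytes: int) -> bool:
    """Return True when free space on the spool's filesystem is below the floor."""
    free = shutil.disk_usage(spool_path).free
    return free < min_free_bytes


if __name__ == "__main__":
    # Hypothetical values: check the current directory against a 20 GB floor.
    floor = 20 * 1000**3
    if should_stop_intake(".", floor):
        print("LOW DISK: stop accepting mail")
    else:
        print("disk space OK")
```

Run something like this from a scheduler or monitoring agent every minute or so, with the floor set to the minimum free space you worked out for your own system.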

Thankfully there are e-mail systems other than Exchange that handle attachments more smartly. I use GroupWise, which will just store an attachment once and let everyone it's been sent to access that 1 file. But any mail server *will* die, possibly beyond repair, when it really runs out of space.

-Overspec your server BADLY, give it at least RAID 6 (survives 2 disk failures) or better, plenty of RAM and CPU power.

The #1 thing people complain about is slow e-mail and no internet access. Your core applications can be down for 15-30 minutes before somebody will call, but if your internet or e-mail is not working it takes 15-30 SECONDS before all your phone lines light up.

A beefy e-mail server will also be able to deal with stupid people generating tons of e-mails (see my first post :P) without bothering other people all that much.

-Research anti spam methods, do NOT rely on commercial sales talk when it comes to anti spam. Dig around the internet for terms like 'Bayes / Grey listing / SPF / different RBL lists / every possible plugin that SpamAssassin can work with'

We've been using an open source scanner called 'MailScanner' for years, fully loaded with multiple RBLs, automatic Bayes learning, multiple virus scanners (this is IMPORTANT! never -ever- trust a single AV, use 2 or MORE! We have 3), blocking of anything executable, virus or not, archive scanning etc. All this is shielded by a greylisting MTA that blocks about 25000 spam e-mails out of the 27000 delivery attempts we get daily. Greylisting is an important tool to keep your scanner load low and your e-mail flowing.
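
For anyone who hasn't met greylisting before, the core idea fits in a few lines of Python. This is a toy sketch of the general technique, not how MailScanner or any particular MTA implements it: the class name, the 300 second delay and the example addresses are all assumptions. The first delivery attempt from an unknown (client IP, sender, recipient) triplet gets a temporary failure, and a retry after the minimum delay is accepted. Real mail servers retry, most spam cannons don't bother.

```python
import time
from typing import Dict, Optional, Tuple

# (client IP, envelope sender, envelope recipient)
Triplet = Tuple[str, str, str]


class Greylist:
    def __init__(self, min_delay: float = 300.0) -> None:
        self.min_delay = min_delay                   # seconds a sender must wait
        self.first_seen: Dict[Triplet, float] = {}   # when each triplet first showed up

    def check(self, client_ip: str, sender: str, recipient: str,
              now: Optional[float] = None) -> str:
        """Return "accept" or "tempfail" for one delivery attempt."""
        now = time.time() if now is None else now
        key = (client_ip, sender.lower(), recipient.lower())
        first = self.first_seen.setdefault(key, now)  # remember first attempt
        if now - first >= self.min_delay:
            return "accept"
        return "tempfail"  # a real MTA would answer with a 4xx temporary error


gl = Greylist(min_delay=300)
print(gl.check("192.0.2.10", "a@example.com", "b@example.org", now=0))    # tempfail
print(gl.check("192.0.2.10", "a@example.com", "b@example.org", now=400))  # accept
```

A production greylister would also expire old entries and whitelist known-good senders, which is left out here to keep the sketch short.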

Do not do anti spam and virus checks on the e-mail server itself, it needs to worry about handling your e-mail so leave it alone!

this leads to:

-Never -ever- hook your main corporate mail server up directly to the internet .. your e-mail server is a primary target for all the bad people and a holder of lots of sensitive data! You can even go as far as to firewall it separately and not allow it access to the Internet at all, other than traffic to a webmail server and the gateway server (which is where you do your anti spam/virus).

-Have backups and possibly archiving that allow single item recovery (with deduplication, especially for Exchange) that your users can access. Your users will throw away important e-mails and ask you to recover them. Especially staff members who go on vacation tend to clean out their inbox before they go, then come back afterwards and figure out they deleted too much.

And last but not least:

-STICK TO YOUR SIZING POLICY AND MAKE SURE IT IS A STRICT ONE! All the mailboxes filled together should not fill up the whole system. Only give people more space if they can come up with valid reasons why they should have a bigger inbox, but when you do, make sure you act hesitant and tell them it usually does not happen. That way they will appreciate you more, yet won't think it is easy to get their inbox size increased and, in turn, stop deleting e-mails they no longer need.
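
The "all mailboxes filled together should not fill up the whole system" check is simple enough to script. A minimal Python sketch, where the function name, the quota values and the capacity are all hypothetical; the reserve stands for the minimum free space you decided to keep available at all times.

```python
from typing import Iterable


def quotas_fit(quotas_mb: Iterable[float], capacity_mb: float,
               reserve_mb: float = 0.0) -> bool:
    """True if every mailbox can be completely full and the reserve is still free."""
    return sum(quotas_mb) + reserve_mb <= capacity_mb


# Made-up example: 200 mailboxes of 250 MB, 100000 MB of storage,
# 20000 MB kept free as headroom.
quotas = [250] * 200
print(quotas_fit(quotas, capacity_mb=100_000, reserve_mb=20_000))  # True

# Hand out 1 GB extra to 40 people and the policy no longer holds.
quotas = [250] * 160 + [1250] * 40
print(quotas_fit(quotas, capacity_mb=100_000, reserve_mb=20_000))  # False
```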

While building a new transparent anti-virus HTTP filter out of open source products (DansGuardian / Squid), I added some trickery to the proxy to apply a Gaussian blur to every picture and made it active for our intern as a test without telling him.

(Can't seem to upload an example)

So at some point the intern goes 'the internet is fuzzy'... I then explained to him that this was because the fiberoptic internet line was out of focus and was probably dirty .. removed the picture altering config line and told him to pull the fiber cable and blow on it to clear any dust that would have accumulated ..

He gave me this 'yea right' look but I managed to persuade him to go into the patch room, pull the cable and blow on it anyway .. while he was doing that I reloaded the proxy and watched him do his cleaning act.

He came back glaring at me because I obviously had this 'I'm pulling your leg' kind of grin, sat down, hit F5 and found all the pictures to be normal again.

Saying he was amazed is a bit of an understatement :-)

IT isn't all bad when you're the infrastructure admin :-)
