The admins and developers agree on a plan for a security fix, but in the ensuing fiasco, realize they weren't on the same page at all
Two tech teams are better than one, right? Only if they communicate
well. For all their shared knowledge and experience, a company's system
admins and developers couldn't prevent a security fix from derailing when
both groups heard only what they wanted to hear from each other at the outset.
Over the years, the previous developers had done an excellent job
of securing this application. It was integrated with the company’s
Active Directory installation, and each user had access only to the
areas of the application they needed, based on their role within the
company.
Each user was also tied to a very specific role in the database, so they
could access only specific information at the database level. Any
change to data was logged in a central location, and we could easily
tell which user made which change to which data, at a very fine-grained
level.
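The article doesn't show the logging schema, but a minimal C# sketch (all names here are hypothetical) gives the flavor of such a fine-grained audit entry: who changed what, where, and when.

using System;

// A hypothetical audit record of the kind the application wrote to its
// central log: which user changed which column of which table, and when.
public record AuditEntry(
    DateTime TimestampUtc,  // when the change happened
    string UserName,        // the Active Directory user who made it
    string TableName,       // the table that was touched
    string ColumnName,      // the column that changed
    string OldValue,        // value before the change
    string NewValue);       // value after the change

public static class AuditLogDemo
{
    public static void Main()
    {
        var entry = new AuditEntry(
            DateTime.UtcNow, @"CORP\jsmith", "Orders", "Status", "Pending", "Shipped");
        Console.WriteLine(entry); // records get a readable ToString() for free
    }
}

In the real system the entries presumably went to a database table rather than the console, but the shape of the record is the point: every change carries the identity of the user who made it.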
To run the batch processing in the evenings, we created a user,
Mr_Roboto, and added him to Active Directory and to the database.
Through an automated process, Mr_Roboto was responsible for running the
batch jobs at night, and all the logging for those batch jobs was done
under his name. We gave Mr_Roboto only as much access as he needed, and "he"
did his job as expected, with few problems, for years.
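As a rough sketch of the arrangement, assume the nightly runner was a Windows process started under the dedicated DOMAIN\Mr_Roboto account (for example via Task Scheduler). Because the process runs as that identity, every log line is attributable to the service account rather than to a shared admin login. The C# below is illustrative only and Windows-specific; the actual batch framework isn't described in the article.

using System;
using System.Security.Principal;

class NightlyBatchRunner
{
    static void Main()
    {
        // Whatever account the scheduler starts this process under is the
        // identity that shows up in the logs -- e.g. "CORP\Mr_Roboto".
        string identity = WindowsIdentity.GetCurrent().Name;

        Console.WriteLine($"{DateTime.UtcNow:o}\t{identity}\tNightlyBatch\tSTARTED");

        // ... run the individual batch jobs here, logging each one the same way ...

        Console.WriteLine($"{DateTime.UtcNow:o}\t{identity}\tNightlyBatch\tFINISHED");
    }
}

The point of the design is least privilege: the service account can touch only the two processing servers and the data the batch jobs need, nothing more.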
Oh, that security hole
In the same company, there was a very bad security practice that needed
to be remedied. On numerous servers, the same user name and password were
used for the System Admin account. If anyone compromised that account
on one server, that person would have administrative access to
pretty much any server within the domain. It was a bad setup that the system admins needed to fix, but the issue persisted for reasons unknown to the development team.
Thus, when one of the admins pulled me aside on a Friday and told me
that for security reasons they were going to be restricting users on the
servers, I couldn't have been happier. The system admin said they were
going to rename accounts and change passwords so that the same user
would not have identical access to different servers across the company.
He then asked if this would be an issue for our .NET application. I told
him it wouldn't be, because we didn’t use the system admin accounts for our
application -- change away!
Later that day, I was informed that the changes had been put into
effect. I went home for the weekend happy, thinking our company’s
servers would be much safer.
The fix that wasn’t
When I arrived at work on Monday, I found that disaster had struck. My
inbox had messages from Mr_Roboto showing the evening batch jobs he’d
attempted had failed. There were open tickets from multiple users
reporting that their batch jobs had not run. I had users calling to ask what
was happening. I checked, and all of the batch processes had failed.
My team went into crisis mode immediately and tried to figure out what
had happened. Looking over the errors, it became apparent that Mr_Roboto no
longer had access to the servers he was supposed to be running on. In
fact, Mr_Roboto was no longer in Active Directory.
Horrified, we called up the system admin to find out what was going on.
The response: “Yes, I changed Mr_Roboto to Mr_RobotoA and Mr_RobotoB and
only gave them access to one of your two processing servers,
respectively.”
Our displeasure with this situation was immediately and loudly
communicated to the system admin. After a few minutes he agreed to
change his “security upgrades” back to the way they were before. It wouldn’t
fix the underlying log-in security problem, but at that point we had a larger
issue on our hands: a whole weekend’s worth of batch processing still
needed our attention.
As a last resort, my boss had the developers use their machines to run
the batch processes. Thankfully, by the end of the day we had cleared up
the backlog, and our users ran all their reports and sent them to the
correct parties.
It’ll be better next time – right?
Our team conducted a postmortem on the situation and came to the following conclusions:
First, as easy as it was to blame the system admin, I should have requested
more details before agreeing to the change. I had wanted to hear -- and
took away from the conversation -- that the system admin accounts were
being changed, but that was not what the system admin was saying. We also
agreed that any future changes needed to be accompanied by an email listing
what was changing and why. Going forward, we resolved not to stand in the
way of any positive changes, but we wanted a clear explanation of what was
being changed before anything was done.
Second, we realized that the system admins didn't know enough about
what we were doing. We addressed this with a two-hour meeting with the
admin team, in which we brought donuts, explained how our application
worked, and described the incident as “the weekend in which Mr_Roboto got
fired.”
Thankfully, positive changes came out of the debacle, such as better communication
between the development and system administration teams. But one huge
problem remained: the system administrator accounts were still the same
on all the servers. For some reason, the sys admins never fixed that
security problem -- they still hadn’t by the time I left the company, and I'm
told the problem stands to this day.
Source: http://www.infoworld.com