Creating Temporary Files and Directories
Consider a situation where you want to retrieve 100 records from a file with 10,000 records. You will need a place to store the
extracted information, perhaps in a temporary file, while you do further processing on it.
Temporary files (and directories) are meant to store data for a short time. Usually, one arranges it so that these files disappear when
the program using them terminates. While you can also use touch to create a temporary file, its predictable name may make it easy for
hackers to gain access to your data.
The best practice is to create random and unpredictable filenames for temporary storage. One way to do this is with the mktemp
utility, as in the following examples.
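For instance, a minimal sketch (the myapp prefix is only an illustrative name):
$ TEMPFILE=$(mktemp /tmp/myapp.XXXXXXXX)       # create a uniquely named temporary file
$ TEMPDIR=$(mktemp -d /tmp/myapp.XXXXXXXX)     # -d creates a temporary directory instead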
The XXXXXXXX is replaced by the mktemp utility with random characters to ensure the name of the temporary file cannot be easily
predicted and is only known within your program.
Example of Creating a Temporary File and Directory
Sloppiness in creation of temporary files can lead to real damage, either by accident or if there is a malicious actor. For example, if
someone were to create a symbolic link from a known temporary file used by root to the /etc/passwd file, like this:
$ ln -s /etc/passwd /tmp/tempfile
There could be a big problem if a script run by root has a line in it like this:
echo $VAR > /tmp/tempfile
The password file will be overwritten by the temporary file contents.
To prevent such a situation, make sure you randomize your temporary file names by replacing the above line with the following lines:
TEMP=$(mktemp /tmp/tempfile.XXXXXXXX)
echo $VAR > $TEMP
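To ensure the temporary file is also removed when the script finishes, a trap can be added right after the mktemp call (a minimal sketch):
trap 'rm -f "$TEMP"' EXIT     # delete the temporary file when the script exits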
Discarding Output with /dev/null
Certain commands like find will produce voluminous amounts of output, which can overwhelm the console. To avoid this, we can
redirect the large output to a special file (a device node) called /dev/null. This pseudofile is also called the bit bucket or black hole.
All data written to it is discarded and write operations never return a failure condition. Using the proper redirection operators, it can
make the output disappear from commands that would normally generate output to stdout and/or stderr:
$ ls -lR /tmp > /dev/null
In the above command, the entire standard output stream is ignored, but any errors will still appear on the console. However, if one
does
$ ls -lR /tmp >& /dev/null
both stdout and stderr will be dumped into /dev/null.
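The same effect can be written in the more portable form that redirects stderr to wherever stdout is going:
$ ls -lR /tmp > /dev/null 2>&1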
Random Numbers and Data
It is often useful to generate random numbers and other random data when performing tasks such as:
● Performing security-related tasks
● Reinitializing storage devices
● Erasing and/or obscuring existing data
● Generating meaningless data to be used for tests.
Such random numbers can be generated by using the $RANDOM environment variable, which is derived from the Linux kernel's built-in
random number generator, or by using the OpenSSL library, which uses FIPS 140 compliant algorithms to generate random numbers for
encryption.
To read more about FIPS 140, see http://en.wikipedia.org/wiki/FIPS_140-2
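For example, both approaches can be tried from the shell (assuming the openssl command-line tool is installed):
$ echo $RANDOM                  # pseudo-random integer between 0 and 32767
$ echo $(( RANDOM % 100 ))      # constrain the result to the range 0-99
$ openssl rand -hex 16          # 16 cryptographically strong random bytes, printed as hex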
How the Kernel Generates Random Numbers
Some servers have hardware random number generators that take as input different types of noise signals, such as thermal noise and
the photoelectric effect. A transducer converts this noise into an electric signal, which is in turn converted into a digital number by an
A-D converter. This number is considered random. However, most common computers do not contain such specialized hardware and,
instead, rely on events gathered during booting and normal operation (keyboard and mouse input, disk and network activity, interrupts)
to create the raw data needed.
Regardless of which of these two sources is used, the system maintains a so-called entropy pool of these digital numbers/random
bits. Random numbers are created from this entropy pool.
The Linux kernel offers the /dev/random and /dev/urandom device nodes, which draw on the entropy pool to provide random
numbers; the quality of the output depends on the estimated number of bits of noise in the entropy pool.
/dev/random is used where very high quality randomness is required, such as one-time pad or key generation, but it is relatively slow
to provide values. /dev/urandom is faster and suitable (good enough) for most cryptographic purposes.
Furthermore, when the entropy pool is empty, /dev/random blocks and does not return any data until additional environmental
noise (network traffic, mouse movement, etc.) is gathered, whereas /dev/urandom reuses the internal pool to produce
more pseudo-random bits.
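For example, random bytes can be read directly from these device nodes (a quick sketch):
$ head -c 16 /dev/urandom | od -An -tx1                  # 16 random bytes, shown in hex
$ dd if=/dev/urandom of=/tmp/random.bin bs=1K count=1    # write 1 KiB of random data to a file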
Lab 3: Using Random Numbers
Write a script which:
1. Takes a word as an argument.
2. Appends a random number to it.
3. Displays the answer.
Lab Solution: Using Random Numbers
Create a file named testrandom.sh, with the content below.
#!/bin/bash
##
# Check to see if the user supplied a parameter.
[[ $# -eq 0 ]] && echo "Usage: $0 word" && exit 1
echo "$1-$RANDOM"
exit 0
Make it executable and run it:
student:/tmp> chmod +x testrandom.sh
student:/tmp> ./testrandom.sh strA
strA-29294
student:/tmp> ./testrandom.sh strB
strB-23911
student:/tmp> ./testrandom.sh strC
strC-27782
student:/tmp>
Summary (1 of 2)
● You can manipulate strings to perform actions such as comparison, sorting, and finding length.
● You can use Boolean expressions when working with multiple data types, including strings or
numbers, as well as files.
● The output of a Boolean expression is either true or false.
● Operators used in Boolean expressions include the && (AND), || (OR), and ! (NOT) operators.
● We looked at the advantages of using the case statement in scenarios where the value of a
variable can lead to different execution paths.
● Script debugging methods help troubleshoot and resolve errors.
● The standard and error outputs from a script or shell commands can easily be redirected into
the same file or separate files to aid in debugging and saving results.
● Linux allows you to create temporary files and directories, which store data for a short duration,
both saving space and increasing security.
● Linux provides several different ways of generating random numbers, which are widely used.
Logical Volume Manager
In Linux, Logical Volume Manager (LVM) is a device mapper target that
provides logical volume management for the Linux kernel.
Most modern Linux distributions are LVM-aware to the point of being able to
have their root file systems on a logical volume.
Device mapper
The device mapper is a framework provided by the Linux kernel for mapping
physical block devices onto higher-level virtual block devices. It forms the
foundation of the logical volume manager (LVM), software RAIDs and dm-crypt
disk encryption, and offers additional features such as file system snapshots.
Device mapper works by passing data from a virtual block device, which is
provided by the device mapper itself, to another block device. Data can also
be modified in transit; this is done, for example, when the device mapper
provides disk encryption or simulates unreliable hardware behavior.
Common uses
1. Creating single logical volumes from multiple physical volumes or entire hard disks
(somewhat similar to RAID 0, but more similar to JBOD), allowing for dynamic
volume resizing.
2. Managing large hard disk farms by allowing disks to be added and replaced
without downtime or service disruption, in combination with hot swapping.
3. On small systems (like a desktop), instead of having to estimate at installation time
how big a partition might need to be, LVM allows filesystems to be easily resized as
needed.
4. Performing consistent backups by taking snapshots of the logical volumes.
5. Encrypting multiple physical partitions with one password.
6. LVM can be considered as a thin software layer on top of the hard disks and
partitions, which creates an abstraction of continuity and ease-of-use for managing
hard drive replacement, repartitioning and backup.
Overview of Logical Volume Manager (LVM)
Before working with LVM it’s important you first understand some basic
concepts around physical volumes, volume groups, logical volumes, and the
file system.
● Physical Volume (PV): This can be created on a whole physical disk (think
/dev/sda) or a Linux partition.
● Volume Group (VG): This is made up of one or more physical volumes.
● Logical Volume (LV): This is sometimes referred to as the partition; it sits
within a volume group and has a file system written to it.
● File System: A file system such as ext4 will be on the logical volume.
Various elements of the LVM
LAB1: Basics of LVM
List all block devices.
[root@server ~]# lsblk | grep sd
sda 8:0 0 8G 0 disk
├─sda1 8:1 0 1G 0 part /boot
└─sda2 8:2 0 7G 0 part
sdb 8:16 0 1G 0 disk
sdc 8:32 0 1G 0 disk
sdd 8:48 0 16G 0 disk
└─sdd1 8:49 0 16G 0 part /repos/centos
sde 8:64 0 1G 0 disk
sdf 8:80 0 1G 0 disk
We use sdb and sdc to create physical volumes and group them in a volume group.
[root@server ~]# pvcreate /dev/sdb
Physical volume "/dev/sdb" successfully created.
[root@server ~]# pvcreate /dev/sdc
Physical volume "/dev/sdc" successfully created.
[root@server ~]# vgcreate datavg /dev/sdb /dev/sdc
Volume group "datavg" successfully created
We create a 500 MB logical volume in the volume group and create an ext4 file system on it.
[root@server ~]# lvcreate -n datavol1 -L 500 datavg
Logical volume "datavol1" created.
[root@server ~]# mkfs -t ext4 /dev/datavg/datavol1
We create a label on the file system, then create a mount point, and then mount the volume using that label.
[root@server ~]# tune2fs -L datavol1 /dev/datavg/datavol1
[root@server ~]# mkdir /datavol1
[root@server ~]# mount -L datavol1 /datavol1
We add a line to /etc/fstab so the volume is mounted at boot, and we test that the entry works.
[root@server ~]# echo "LABEL=datavol1 /datavol1 ext4 defaults 0 0" >> /etc/fstab
[root@server ~]# umount /datavol1
[root@server ~]# mount /datavol1
[root@server ~]# df -h | grep datavol1
/dev/mapper/datavg-datavol1 477M 2.3M 445M 1% /datavol1
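At any point, the pvs, vgs and lvs commands give a compact summary of the LVM objects just created (output will vary):
[root@server ~]# pvs      # list physical volumes
[root@server ~]# vgs      # list volume groups
[root@server ~]# lvs      # list logical volumes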
Resize of a Logical Volume
Here we show how to expand an LVM volume or partition in Linux by first
resizing the logical volume and then resizing the file system to take advantage
of the additional space.
This process is straightforward with LVM, as it can be done on the fly with no
downtime: you can perform it on a mounted volume without interruption. In
order to increase the size of a logical volume, the volume group that it is in
must have free space available.
Check free space in the volume group: vgdisplay
[root@CentOS7 ~]# vgdisplay
--- Volume group ---
VG Name centos
System ID
Format lvm2
Metadata Areas 2
Metadata Sequence No 6
VG Access read/write
VG Status resizable
MAX LV 0
Cur LV 3
Open LV 2
Max PV 0
Cur PV 2
Act PV 2
VG Size 20.74 GiB
PE Size 4.00 MiB
Total PE 5309
Alloc PE / Size 4030 / 15.74 GiB
Free PE / Size 1280 / 5.00 GiB
VG UUID VvG6Sp-wIgb-LTh0-szdU-s9R1-a6K9-qHassI
What if we don’t have space? vgextend
If you do not have any (or enough) free space in the volume group, you will first
need to expand the volume group to complete the resize. Alternatively, if you
have multiple LVM partitions, you could shrink a different logical volume first
to create space within the volume group.
For example, to add two PVs to a VG:
vgextend vg00 /dev/sda4 /dev/sdn1
Where does the LV belong? lvdisplay
Now that we have confirmed there is free space within the volume group, confirm the name of the logical volume you want to
increase, as well as how much space you plan on adding. The lvdisplay command will show all logical volumes and their current size.
It will also show the volume group that each logical volume is a member of, so ensure that the correct volume group has been checked
for enough space with vgdisplay, as previously mentioned, to prevent trying to increase a logical volume that is inside some other
volume group.
[root@CentOS7 ~]# lvdisplay
--- Logical volume ---
LV Path /dev/centos/var
LV Name var
VG Name centos
LV UUID 7PNgg2-ZmnG-a26g-zRoT-PRVM-RDc1-oq6J4M
LV Write Access read/write
LV Creation host, time CentOS7, 2015-04-16 07:50:25 +1000
LV Status available
# open 0
LV Size 5.00 GiB
Current LE 1280
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 256
Block device 253:2
Expand the LV
Now it's time to expand the logical volume. In this example, we are using the -L
flag to increase the volume by a specified size (M for megabytes, G for gigabytes,
T for terabytes). You can alternatively remove the + to increase the volume to the
specified size rather than by the specified amount.
lvextend -L+5G /dev/centos/var
Rounding size to boundary between physical extents: 4.90 GiB
Size of logical volume centos/var changed from 5.00 GiB (1280 extents) to 10.00 GiB (2560 extents).
Logical volume var successfully resized
lvextend
The previous command will increase the logical volume /dev/centos/var by
5GB; as it is currently 5GB, this will increase it to a total of 10GB. You
could achieve the same with
lvextend -L 10G /dev/centos/var
which will also increase the logical volume to 10GB, as this is the absolute
size specified when no + is used.
Alternatively, if you want to use all free space in the volume group
rather than specifying a size to increase to, run
lvextend -l +100%FREE /dev/centos/var
Final check
[root@CentOS7 ~]# lvdisplay
--- Logical volume ---
LV Path /dev/centos/var
LV Name var
VG Name centos
LV UUID 7PNgg2-ZmnG-a26g-zRoT-PRVM-RDc1-oq6J4M
LV Write Access read/write
LV Creation host, time CentOS7, 2015-04-16 07:50:25 +1000
LV Status available
# open 0
LV Size 10.00 GiB
Current LE 2560
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 8192
Block device 253:2
Filesystem resize
Now that the logical volume has been extended, we can resize the file system.
This will extend the file system so that it takes up the newly created space
inside the logical volume. The command may differ depending on the type of
file system you are using.
resize2fs /dev/centos/var      # for ext2/ext3/ext4 file systems
xfs_growfs /dev/centos/var     # for XFS file systems
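After the file system has been grown, df -h should show the new size (a quick check, assuming the logical volume is mounted on /var):
df -h /var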
RAID
RAID (Redundant Array of Independent Disks, originally Redundant Array of
Inexpensive Disks) is a data storage virtualization technology that
combines multiple physical disk drive components into one or more
logical units for the purposes of data redundancy, performance
improvement, or both.
Data is distributed across the drives in one of several ways, referred to as
RAID levels, depending on the required level of redundancy and performance.
The different schemes, or data distribution layouts, are named by the word
"RAID" followed by a number, for example RAID 0 or RAID 1. Each schema, or
RAID level, provides a different balance among the key goals: reliability,
availability, performance, and capacity. RAID levels greater than RAID 0
provide protection against unrecoverable sector read errors, as well as
against failures of whole physical drives.
RAID 0 - STRIPE
RAID 0 consists of striping, but no mirroring or parity. Compared
to a spanned volume, the capacity of a RAID 0 volume is the
same; it is the sum of the capacities of the disks in the set.
But because striping distributes the contents of each file among
all disks in the set, the failure of any disk causes all files, the
entire RAID 0 volume, to be lost. A broken spanned volume at
least preserves the files on the unfailing disks.
The benefit of RAID 0 is that the throughput of read and write
operations to any file is multiplied by the number of disks
because, unlike spanned volumes, reads and writes are done
concurrently. The cost is complete vulnerability to drive
failures.
RAID 1 - MIRROR
RAID 1 consists of data mirroring, without parity or striping. Data is
written identically to two drives, thereby producing a "mirrored set"
of drives. Thus, any read request can be serviced by any drive in the
set. If a request is broadcast to every drive in the set, it can be
serviced by the drive that accesses the data first (depending on its
seek time and rotational latency), improving performance.
Sustained read throughput, if the controller or software is optimized
for it, approaches the sum of throughputs of every drive in the set,
just as for RAID 0. Actual read throughput of most RAID 1
implementations is slower than the fastest drive. Write throughput
is always slower because every drive must be updated, and the
slowest drive limits the write performance. The array continues to
operate as long as at least one drive is functioning.
RAID 5 - block level stripe and distributed parity
RAID 5 consists of block-level striping with distributed parity.
Parity information is distributed among the drives, requiring all
drives but one to be present to operate. Upon failure of a single
drive, subsequent reads can be calculated from the distributed
parity such that no data is lost. RAID 5 requires at least three
disks. Like all single-parity concepts, large RAID 5
implementations are susceptible to system failures because of
trends regarding array rebuild time and the chance of drive
failure during rebuild. Rebuilding an array requires reading all
data from all disks, opening a chance for a second drive failure
and the loss of the entire array.
In August 2012, Dell posted an advisory against the use of RAID 5 in any configuration on
Dell EqualLogic arrays and RAID 50 with "Class 2 7200 RPM drives of 1 TB and higher
capacity" for business-critical data.
RAID 6 - block level stripe and double distributed parity
RAID 6 consists of block-level striping with double distributed parity. Double parity provides fault
tolerance up to two failed drives. This makes larger RAID groups more practical, especially for
high-availability systems, as large-capacity drives take longer to restore.
RAID 6 requires a minimum of four disks. As with RAID 5, a single drive failure results in reduced
performance of the entire array until the failed drive has been replaced. With a RAID 6 array, using
drives from multiple sources and manufacturers, it is possible to mitigate most of the problems
associated with RAID 5. The larger the drive capacities and the larger the array size, the more
important it becomes to choose RAID 6 instead of RAID 5.
Nested RAID
RAID 0+1: creates two stripes and mirrors them. If a single drive failure occurs,
then one of the stripes has failed, and at that point the array is effectively running
as RAID 0 with no redundancy. Significantly higher risk is introduced during a
rebuild than with RAID 1+0, as all the data from all the drives in the remaining
stripe has to be read rather than just from one drive, increasing the chance of an
unrecoverable read error (URE) and significantly extending the rebuild window.
RAID 1+0: creates a striped set from a series of mirrored drives. The array can
sustain multiple drive losses so long as no mirror loses all its drives.
Other nested levels include RAID 10 (1+0), RAID 50, RAID 60, and RAID 100.
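For instance, using the mdadm utility covered below, a RAID 10 array could be created like this (a sketch only; the four device names are assumptions):
$ sudo mdadm --create --verbose /dev/md1 --level=10 \
    --raid-devices=4 /dev/sdb /dev/sdc /dev/sdd /dev/sde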
HW Implementation
The hardware-based array manages the RAID subsystem independently from the host. It presents a
single disk per RAID array to the host.
A Hardware RAID device connects to the SCSI controller and presents the RAID arrays as a single SCSI
drive. An external RAID system moves all RAID handling "intelligence" into a controller located in the
external disk subsystem. The whole subsystem is connected to the host via a normal SCSI controller
and appears to the host as a single disk.
RAID controller cards function like a SCSI controller to the operating system, and handle all the actual
drive communications. The user plugs the drives into the RAID controller (just like a normal SCSI
controller) and then adds them to the RAID controller's configuration, and the operating system
won't know the difference.
SW Implementation
Software RAID implements the various RAID levels in the kernel disk (block device) code. It offers the
cheapest possible solution, as expensive disk controller cards or hot-swap chassis are not required.
Software RAID also works with cheaper IDE disks as well as SCSI disks. With today's faster CPUs,
Software RAID can often outperform Hardware RAID.
The Linux kernel contains an MD driver that allows the RAID solution to be completely hardware
independent. The performance of a software-based array depends on the server CPU performance
and load.
Create a SW RAID array on Ubuntu
The mdadm utility can be used to create and manage storage arrays using
Linux's software RAID capabilities.
Administrators have great flexibility in coordinating their individual storage
devices and creating logical storage devices that have greater performance or
redundancy characteristics.
We will go over a number of different RAID configurations that can be set up
using an Ubuntu 16.04 server.
Prerequisites
● A non-root user with sudo privileges on an Ubuntu 16.04 server
● A basic understanding of RAID terminology and concepts
● Multiple raw storage devices available on your server
Creating a RAID 0 Array
The RAID 0 array works by breaking up data into chunks and striping it across
the available disks. This means that each disk contains a portion of the data
and that multiple disks will be referenced when retrieving information.
● Requirements: minimum of 2 storage devices
● Primary benefit: Performance
● Things to keep in mind: Make sure that you have functional backups. A
single device failure will destroy all data in the array.
Identify the Component Devices
To get started, find the identifiers for the raw disks that you will be using:
$ lsblk -o NAME,SIZE,FSTYPE,TYPE,MOUNTPOINT
Output
NAME     SIZE FSTYPE TYPE MOUNTPOINT
sda      100G        disk
sdb      100G        disk
vda       20G        disk
├─vda1    20G ext4   part /
└─vda15    1M        part
Create the Array
To create a RAID 0 array with these components, pass them in to the
mdadm --create command. You will have to specify the device name you
wish to create (/dev/md0 in our case), the RAID level, and the number of
devices:
$ sudo mdadm --create --verbose /dev/md0 --level=0 \
--raid-devices=2 /dev/sda /dev/sdb
$ cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid0 sdb[1] sda[0]
209584128 blocks super 1.2 512k chunks
unused devices: <none>
Create and Mount the Filesystem
Next, create a filesystem on the array:
sudo mkfs.ext4 -F /dev/md0
Create a mount point to attach the new filesystem:
sudo mkdir -p /mnt/md0
You can mount the filesystem by typing:
sudo mount /dev/md0 /mnt/md0
Check whether the new space is available by typing:
df -h -x devtmpfs -x tmpfs
Save the Array Layout
To make sure that the array is reassembled automatically at boot, we will have to adjust the /etc/mdadm/mdadm.conf file.
You can automatically scan the active array and append the file by typing:
sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf
Afterwards, you can update the initramfs, or initial RAM file system, so that the array will be available during the early boot
process:
sudo update-initramfs -u
Add the new filesystem mount options to the /etc/fstab file for automatic mounting at boot:
echo '/dev/md0 /mnt/md0 ext4 defaults,nofail,discard 0 0' \
| sudo tee -a /etc/fstab
Your RAID 0 array should now automatically be assembled and mounted each boot.
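The state and members of the array can be confirmed at any time (a quick check):
$ cat /proc/mdstat
$ sudo mdadm --detail /dev/md0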
Linux System Logs
Linux system administrators often need to look at log files for troubleshooting
purposes. In fact, this is the first thing any sysadmin would do.
Linux and the applications that run on it can generate all different types of
messages, which are recorded in various log files. Linux uses a set of
configuration files, directories, programs, commands and daemons to create,
store and recycle these log messages. Knowing where the system keeps its log
files and how to make use of related commands can therefore help save
valuable time during troubleshooting.
We will have a look at different parts of the Linux logging mechanism.
Default Log File Location
The default location for log files in Linux is /var/log.
You can view the list of log files in this directory with a simple
ls -l /var/log
command.
Viewing Log File Contents
Here are some common log files you will find under /var/log:
● wtmp
● utmp
● dmesg
● messages
● maillog or mail.log
● spooler
● auth.log or secure
The wtmp and utmp files keep track of users logging in and out of the system. You cannot directly
read the contents of these files using cat – there are specific commands for that.
We will now use some of these commands.
who
To see who is currently logged in to the Linux server, simply use the who
command.
This command gets its values from the /var/run/utmp file (for CentOS and
Debian) or /run/utmp (for Ubuntu).
last
The last command tells us the login history of users; it reads its data from /var/log/wtmp:
lastlog
To see when someone last logged in to the system, use
lastlog
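Both commands accept options to narrow the output, for example (the user name here is just an assumption):
$ last -n 5              # show only the five most recent logins
$ lastlog -u student     # show the last login of the user "student"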
Text-based log files
For other text-based log files, you can use cat, head or tail commands to
read the contents.
In the example below, I am trying to look at the last ten lines of the
/var/log/messages file on a Debian box:
debian@debian:~$ sudo tail /var/log/messages
The rsyslog Daemon
At the heart of the logging mechanism is the rsyslog daemon.
This service is responsible for listening to log messages from different parts of
a Linux system and routing the message to an appropriate log file in the
/var/log directory.
It can also forward log messages to another Linux server.
The rsyslog Configuration File
The rsyslog daemon gets its configuration information from the rsyslog.conf file. The file is located under the /etc
directory.
Basically, the rsyslog.conf file tells the rsyslog daemon where to save its log messages. This instruction comes
from a series of two-part lines within the file.
On Ubuntu, the default rules are found in /etc/rsyslog.d/50-default.conf.
The two-part instruction is made up of a selector and an action, separated by white space.
The selector part specifies the source and importance of the log message, and the action part says what to do
with the message.
The selector itself is again divided into two parts separated by a dot (.). The first part before the dot is called facility
(the origin of the message) and the second part after the dot is called priority (the severity of the message).
Together, the facility/priority and the action pair tell rsyslog what to do when a log message matching the
criteria is generated.
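For example, typical selector/action lines look like the following (exact log file paths vary by distribution):
# facility.priority                                     action
authpriv.*                                              /var/log/secure
mail.*                                                  -/var/log/maillog
*.info;mail.none;authpriv.none;cron.none                /var/log/messages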