Update on Hadoop

I have moved on to Hadoop 2 (Cloudera’s CDH4), and I *definitely* suggest NOT manually maintaining Hadoop nodes the way I described in my earlier posts; it is much easier with Cloudera Manager. I have also moved on to Hypertable, effectively replacing HBase for all our tables. It is slightly faster (8ms vs 20ms for puts from Django) and the enterprise support is good (see the UpTime subscription on hypertable.org). The Python modules work a lot better as well.


Threaded Upload of PostgreSQL WAL Files to S3 Bucket

This Bash script spawns 10 separate parallel processes of s3-mp-upload.py, which is itself a multipart, parallelized Python script that uses the boto library. It uploads the 16MB WAL files in the /pg_wal folder on our database server that are older than 8 hours. The S3 bucket is configured to keep 3 months of WAL files; anything older is moved to Amazon Glacier. Files older than 8 hours are deleted from the database server after a successful upload. We make sure to set the x-amz-server-side-encryption:AES256 header to use S3's server-side AES256 encryption.

#!/bin/bash

# this script runs in cron.daily
# purpose: find files in $PG_WAL directory that are older than 8 hours
#          and put those files in s3://$S3_BUCKET/. Those files are then
#          deleted, to free space for more WAL to accumulate in $PG_WAL.
# note:    each WAL file is 16MB
# requirements: s3-mp-upload.py in chaturbate_system_files/backups/
#               python
#               boto (python module)
#               argparse (python module)
#               make sure to set server side encryption header in S3 
#               s3cmd to make sure file exists before deleting in s3.

export AWS_ACCESS_KEY_ID=XXXXXX
export AWS_SECRET_ACCESS_KEY=XXXXXX
export ENCRYPT=x-amz-server-side-encryption:AES256
export PATH=/usr/local/bin:$PATH

S3_MP_BIN=/root/system_files/backups/s3-mp-upload.py
S3_BUCKET=pgwal.mybucket.com
PG_WAL=/pg_wal
S3_THREADS=3 #s3-mp-upload threads
WAL_AGE=480 #480 minutes = 8 hours
THREADS=10 #bash threads
CHUNKSIZE=5
QUIET="" #use -q for crons

CNT=1
for i in `find $PG_WAL -type f -mmin +${WAL_AGE}`; do
        if [ $(($CNT % $THREADS)) -eq 0 ]; then
                #wait for spawned uploads x $THREADS to finish
                wait
                for j in $(seq 1 $(($THREADS-1))); do
                        if [ "`s3cmd ls s3://${S3_BUCKET}/${wal[$j]} | awk '{ print $3}'`" != "16777216" ]; then
                                # is this a .backup file?
                                [ `echo ${wal[$j]} | awk -F\. '{print $2}'` ] && continue;

                                echo "problem uploading ${wal[$j]} to $S3_BUCKET!"
                                echo "`s3cmd ls s3://${S3_BUCKET}/${wal[$j]}` -- not 16777216 bytes!"
                                exit 1  # non-zero so the failure is visible to cron
                        else
                                # echo the rm unless running quietly (-q)
                                [ -z "$QUIET" ] && echo rm -f $PG_WAL/${wal[$j]}
                                rm -f $PG_WAL/${wal[$j]}
                        fi
                done
        else
                f=`basename $i`
                for j in $(seq 1 $(($THREADS-1))); do
                        [ $(($CNT % $THREADS)) -eq $j ] && wal[$j]=$f
                done
                $S3_MP_BIN -f $QUIET -np $S3_THREADS -s $CHUNKSIZE $i s3://$S3_BUCKET/$f &
        fi
        CNT=$(($CNT + 1))
done

Here’s a copy of s3-mp-upload.py (patched to use the encryption header); an example invocation follows the script:

#!/usr/bin/env python
import argparse
from cStringIO import StringIO
import logging
from math import ceil
from multiprocessing import Pool
import time
import urlparse

import boto

parser = argparse.ArgumentParser(description="Transfer large files to S3",
        prog="s3-mp-upload")
parser.add_argument("src", type=file, help="The file to transfer")
parser.add_argument("dest", help="The S3 destination object")
parser.add_argument("-np", "--num-processes", help="Number of processors to use",
        type=int, default=2)
parser.add_argument("-f", "--force", help="Overwrite an existing S3 key",
        action="store_true")
parser.add_argument("-s", "--split", help="Split size, in Mb", type=int, default=50)
parser.add_argument("-rrs", "--reduced-redundancy", help="Use reduced redundancy storage. Default is standard.", default=False,  action="store_true")
parser.add_argument("-v", "--verbose", help="Be more verbose", default=False, action="store_true")
parser.add_argument("-q", "--quiet", help="Be less verbose (for use in cron jobs)", default=False, action="store_true")

logger = logging.getLogger("s3-mp-upload")

def do_part_upload(args):
    """
    Upload a part of a MultiPartUpload

    Open the target file and read in a chunk. Since we can't pickle
    S3Connection or MultiPartUpload objects, we have to reconnect and lookup
    the MPU object with each part upload.

    :type args: tuple of (string, string, string, int, int, int)
    :param args: The actual arguments of this method. Due to lameness of
                 multiprocessing, we have to extract these outside of the
                 function definition.

                 The arguments are: S3 Bucket name, MultiPartUpload id, file
                 name, the part number, part offset, part size
    """
    # Multiprocessing args lameness
    bucket_name, mpu_id, fname, i, start, size = args
    logger.debug("do_part_upload got args: %s" % (args,))

    # Connect to S3, get the MultiPartUpload
    s3 = boto.connect_s3()
    bucket = s3.lookup(bucket_name)
    mpu = None
    for mp in bucket.list_multipart_uploads():
        if mp.id == mpu_id:
            mpu = mp
            break
    if mpu is None:
        raise Exception("Could not find MultiPartUpload %s" % mpu_id)

    # Read the chunk from the file
    fp = open(fname, 'rb')
    fp.seek(start)
    data = fp.read(size)
    fp.close()
    if not data:
        raise Exception("Unexpectedly tried to read an empty chunk")

    def progress(x,y):
        logger.debug("Part %d: %0.2f%%" % (i+1, 100.*x/y))

    # Do the upload
    t1 = time.time()
    mpu.upload_part_from_file(StringIO(data), i+1, cb=progress)

    # Print some timings
    t2 = time.time() - t1
    s = len(data)/1024./1024.
    logger.info("Uploaded part %s (%0.2fM) in %0.2fs at %0.2fMbps" % (i+1, s, t2, s/t2))

def main(src, dest, num_processes=8, split=50, force=False, reduced_redundancy=False, verbose=False, quiet=False):
    # Check that dest is a valid S3 url
    split_rs = urlparse.urlsplit(dest)
    if split_rs.scheme != "s3":
        raise ValueError("'%s' is not an S3 url" % dest)

    s3 = boto.connect_s3()
    bucket = s3.lookup(split_rs.netloc)
    key = bucket.get_key(split_rs.path)
    # See if we're overwriting an existing key
    if key is not None:
        if not force:
            raise ValueError("'%s' already exists. Specify -f to overwrite it" % dest)

    # Determine the splits
    part_size = max(5*1024*1024, 1024*1024*split)
    src.seek(0,2)
    size = src.tell()
    num_parts = int(ceil(size / part_size))

    # If file is less than 5M, just upload it directly
    if size < 5*1024*1024:
        src.seek(0)
        t1 = time.time()
        k = boto.s3.key.Key(bucket,split_rs.path)
        k.set_contents_from_file(src, encrypt_key=True)
        t2 = time.time() - t1
        s = size/1024./1024.
        logger.info("Finished uploading %0.2fM in %0.2fs (%0.2fMbps)" % (s, t2, s/t2))
        return

    # Create the multi-part upload object
    mpu = bucket.initiate_multipart_upload(split_rs.path, reduced_redundancy=reduced_redundancy, encrypt_key=True)
    logger.info("Initialized upload: %s" % mpu.id)

    # Generate arguments for invocations of do_part_upload
    def gen_args(num_parts, fold_last):
        for i in range(num_parts+1):
            part_start = part_size*i
            if i == (num_parts-1) and fold_last is True:
                yield (bucket.name, mpu.id, src.name, i, part_start, part_size*2)
                break
            else:
                yield (bucket.name, mpu.id, src.name, i, part_start, part_size)


    # If the last part is less than 5M, just fold it into the previous part
    fold_last = ((size % part_size) < 5*1024*1024)

    # Do the thing
    try:
        # Create a pool of workers
        pool = Pool(processes=num_processes)
        t1 = time.time()
        pool.map_async(do_part_upload, gen_args(num_parts, fold_last)).get(9999999)
        # Print out some timings
        t2 = time.time() - t1
        s = size/1024./1024.
        # Finalize
        src.close()
        mpu.complete_upload()
        logger.info("Finished uploading %0.2fM in %0.2fs (%0.2fMbps)" % (s, t2, s/t2))
    except KeyboardInterrupt:
        logger.warn("Received KeyboardInterrupt, canceling upload")
        pool.terminate()
        mpu.cancel_upload()
    except Exception, err:
        logger.error("Encountered an error, canceling upload")
        logger.error(err)
        mpu.cancel_upload()

if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    args = parser.parse_args()
    arg_dict = vars(args)
    if arg_dict['quiet'] == True:
        logger.setLevel(logging.WARNING)
    if arg_dict['verbose'] == True:
        logger.setLevel(logging.DEBUG)
    logger.debug("CLI args: %s" % args)
    main(**arg_dict)
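
For reference, here is roughly how the wrapper script above invokes it. The WAL file name below is just a placeholder, and the credentials come from the same environment variables the bash script exports (boto picks them up automatically):

export AWS_ACCESS_KEY_ID=XXXXXX
export AWS_SECRET_ACCESS_KEY=XXXXXX
# -f overwrite an existing key, -np 3 upload processes, -s 5 -> 5MB parts
./s3-mp-upload.py -f -np 3 -s 5 /pg_wal/000000010000000A000000FF \
        s3://pgwal.mybucket.com/000000010000000A000000FF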

Installing Hadoop 1.0.x-stable On 10+ Nodes With HBase, ZooKeeper, Thrift, HappyBase – Part 2!

We left off setting up the NameNode for HDFS. Let’s configure/install the JobTracker (www4) and the rest of the slaves, www5-www15.

  • Set up the hdfs Unix user on each node with the same gid/uid; in our case it is id 509: groupadd -g 509 hdfs ; useradd -u 509 -g 509 hdfs (a loop that runs all of these per-node steps is sketched after this list)
  • Install java as before as root yum install java-1.7.0-openjdk java-1.7.0-openjdk-devel
  • wget http://apache.petsads.us/hadoop/common/stable/hadoop-1.0.4.tar.gz
  • tar -C /usr/local -zxvf hadoop-1.0.4.tar.gz
  • ln -s /usr/local/hadoop-1.0.4 /usr/local/hadoop
  • mkdir /usr/local/hadoop/namenode ; mkdir /usr/local/hadoop/datanode
  • mkdir -p /var/hadoop/temp ; chown hdfs /var/hadoop/temp
  • chown -R hdfs /usr/local/hadoop-1.0.4
  • Copy the config files modified in part 1 over… from the master node: scp /usr/local/hadoop/conf/* hdfs@www4:/usr/local/hadoop/conf/
  • Put your JAVA_HOME in your ~/.bashrc / rc file: export JAVA_HOME=/usr/lib/jvm/java
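
Since the same steps repeat on every node, a loop along these lines saves typing. This is only a rough sketch: it assumes root SSH access to each node and that hadoop-1.0.4.tar.gz sits in the current directory on the machine you run it from.

    for n in $(seq 4 15); do
            scp hadoop-1.0.4.tar.gz root@www$n:/root/
            ssh root@www$n '
                    groupadd -g 509 hdfs ; useradd -u 509 -g 509 hdfs
                    yum -y install java-1.7.0-openjdk java-1.7.0-openjdk-devel
                    tar -C /usr/local -zxvf /root/hadoop-1.0.4.tar.gz
                    ln -s /usr/local/hadoop-1.0.4 /usr/local/hadoop
                    mkdir /usr/local/hadoop/namenode /usr/local/hadoop/datanode
                    mkdir -p /var/hadoop/temp ; chown hdfs /var/hadoop/temp
                    chown -R hdfs /usr/local/hadoop-1.0.4
            '
    done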

Download HBase and get started with its installation:

  • wget http://apache.mesi.com.ar/hbase/stable/hbase-0.94.4.tar.gz
  • tar -C /usr/local -zxvf hbase-0.94.4.tar.gz
  • ln -s /usr/local/hbase-0.94.4 /usr/local/hbase
  • edit /usr/local/hbase/conf/hbase-site.xml:
    <configuration>
      <property>
        <name>hbase.rootdir</name>
        <value>hdfs://www3:8020/hbase</value>
        <description>The directory shared by RegionServers.
        </description>
      </property>
      <property>
        <name>hbase.cluster.distributed</name>
        <value>true</value>
      </property>
      <property>
        <name>hbase.zookeeper.quorum</name>
        <value>www3,www5,www7,www9,www11</value>
      </property>
      <property>
        <name>hbase.zookeeper.property.dataDir</name>
        <value>/usr/local/hbase/zookeeper</value>
      </property>
    </configuration>
    
  • start-hbase.sh
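
Once start-hbase.sh has run on the master, a quick sanity check from the HBase shell (assuming /usr/local/hbase/bin is on your PATH; the table and column family names are throwaway examples) should show live RegionServers and let you create a table:

    /usr/local/hbase/bin/hbase shell
    hbase(main):001:0> status
    hbase(main):002:0> create 'test_table', 'cf1'
    hbase(main):003:0> list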

Installing Hadoop 1.0.x-stable On 10+ Nodes With HBase, ZooKeeper, Thrift, HappyBase – Part 1!

For more information about Hadoop, please watch https://www.youtube.com/watch?v=d2xeNpfzsYI. HBase on top of Hadoop provides powerful, extremely high-throughput storage (backed by Hadoop HDFS), with secondary indexing, automatic sharding of data, and Map-Reduce. I’m going to try to keep this guide as simple as possible for our future reference. I hope you find it useful!

The hardware topology in our case is 13 Dell R420 1RU webservers, each with 143GB SSD RAID 1 drives, 2 x Intel E5-2450 @ 2.10GHz CPUs (20MB cache) for 16 cores (32 with HT), and 32GB of RAM. These servers sit in 2 racks connected by 2 x 1gig Foundry switches split into 2 VLANs – frontend and backend.

N.B., Hadoop 1.0.x-stable requires one NameNode for filesystem metadata (see Hadoop 2.0 for NameNode HA and HDFS Federation), at least three DataNodes (the default replication factor is 3), exactly one JobTracker, and many TaskTrackers.

Footnote: In Hadoop 2.0.x, the Map-Reduce JobTracker has been split into two components: the ResourceManager and ApplicationMaster. Apache mentions that, “the new ResourceManager manages the global assignment of compute resources to applications and the per-application ApplicationMaster manages the application’s scheduling and coordination. An application is either a single job in the sense of classic MapReduce jobs or a DAG of such jobs. The ResourceManager and per-machine NodeManager daemon, which manages the user processes on that machine, form the computation fabric. The per-application ApplicationMaster is, in effect, a framework specific library and is tasked with negotiating resources from the ResourceManager and working with the NodeManager(s) to execute and monitor the tasks.” For more information see this YARN document.

In our case, we will set up 1 master NameNode with hostname www3 and 1 master JobTracker with hostname www4 for Map/Reduce jobs. The rest of the servers will be slaves, and as such will be DataNodes with TaskTrackers. These will have hostnames www5 through www15. www2 will be a hot spare of www3 in the event of system failure. We achieve this by specifying a secondary NFS path for dfs.name.dir, which will be mounted on our failover server and replayed in the event of www3 failure. The operating systems on these servers are all CentOS 6.x x86_64.

  • Create a new Unix user hdfs that will be for HDFS daemon operations on each node: useradd hdfs
  • Download Hadoop 1.0.x stable. In our case, we get from a mirror: wget http://apache.petsads.us/hadoop/common/stable/hadoop-1.0.4.tar.gz.
  • Extract the directory to /usr/local: tar -C /usr/local -zxvf hadoop-1.0.4.tar.gz then ln -s /usr/local/hadoop-1.0.4 /usr/local/hadoop. Then make sure it is owned by the hdfs user: chown -R hdfs /usr/local/hadoop-1.0.4
  • Make sure java is installed. yum install java-1.7.0-openjdk java-1.7.0-openjdk-devel
  • First we set up the NameNode on www3. Open /usr/local/hadoop/conf/core-site.xml:
    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    
    <!-- Put site-specific property overrides in this file. -->
    
    <configuration>
      <property>
    <!-- URI of NameNode (master metadata HDFS node) -->
        <name>fs.default.name</name>
        <value>hdfs://www3/</value>
      </property>
      <property>
        <name>fs.inmemory.size.mb</name>
        <value>200</value>
        <!--Larger amount of memory allocated for the in-memory file-system used to merge map-outputs at the reduces. -->
      </property>
      <property>
        <name>io.sort.factor</name>
        <value>100</value>
      </property>
      <property>
        <name>io.sort.mb</name>
        <value>200</value>
      </property>
      <property>
        <name>io.file.buffer.size</name>
        <value>131072</value>
      </property>
    </configuration>
    
  • Next create a folder for Namenode metadata and mapreduce temp data: mkdir /usr/local/hadoop/namenode ; mkdir /usr/local/hadoop/datanode; mkdir -p /var/hadoop/temp ; chown hdfs /usr/local/hadoop/namenode; chown hdfs /usr/local/hadoop/datanode; chown -R hdfs /var/hadoop/temp. Edit /usr/local/hadoop/conf/hdfs-site.xml:
    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    
    <!-- Put site-specific property overrides in this file. -->
    
    <configuration>
      <property>
        <name>dfs.name.dir</name>
        <value>/usr/local/hadoop/namenode,/nfs_storage/hadoop/namenode_backup</value>
      </property>
      <property>
        <name>dfs.block.size</name>
        <!-- 128MB -->
        <value>134217728</value>
      </property>
      <property>
        <name>dfs.namenode.handler.count</name>
        <!-- # RPC threads from datanodes -->
        <value>40</value>
      </property>
      <property>
        <name>dfs.data.dir</name>
        <value>/usr/local/hadoop/datanode</value>
      </property>
      <property>
        <name>dfs.datanode.max.xcievers</name>
        <value>4096</value>
      </property>
      <property>
        <name>dfs.support.append</name>
        <value>true</value>
      </property>
    </configuration>
    
  • Next modify /usr/local/hadoop/conf/mapred-site.xml:
    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    
    <!-- Put site-specific property overrides in this file. -->
    
    <configuration>
      <property>
        <name>mapred.job.tracker</name>
        <value>www4:8021</value>
      </property>
      <property>
        <name>mapred.system.dir</name>
        <value>/hadoop/mapred/system/</value>
      </property>
      <property>
        <name>mapred.local.dir</name>
        <value>/var/hadoop/temp</value>
      </property>
      <property>
        <name>mapred.tasktracker.map.tasks.maximum</name>
        <value>20</value>
      </property>
      <property>
        <name>mapred.tasktracker.reduce.tasks.maximum</name>
        <value>20</value>
      </property>
      <property>
        <name>mapred.queue.names</name>
        <value>default,rooms</value>
      </property>
      <property>
        <name>mapred.acls.enabled</name>
        <value>false</value>
      </property>
      <property>
        <name>mapred.reduce.parallel.copies</name>
        <value>20</value>
        <!--Higher number of parallel copies run by reduces to fetch outputs from very large number of maps.-->
      </property>
      <property>
        <name>mapred.map.child.java.opts</name>
        <value>-Xmx512M</value>
      </property>
      <property>
        <name>mapred.reduce.child.java.opts</name>
        <value>-Xmx512M</value>
      </property>
      <property>
        <name>mapred.task.tracker.task-controller</name>
        <value>org.apache.hadoop.mapred.DefaultTaskController</value>
      </property>
    </configuration>
  • Next modify your slaves and masters files /usr/local/hadoop/conf/slaves and /usr/local/hadoop/conf/masters:
    [root@www3 conf]# cat masters
    www3
    www4
    
    [root@www3 conf]# cat slaves
    www5
    www6
    www7
    www8
    www9
    www10
    www11
    www12
    www13
    www14
    www15
    
  • We now need to set up your environment on the NameNode www3. Open /usr/local/hadoop/conf/hadoop-env.sh:
    export HADOOP_NAMENODE_OPTS="-XX:+UseParallelGC ${HADOOP_NAMENODE_OPTS}"
    export HADOOP_HEAPSIZE="1000"
    export JAVA_HOME=/usr/lib/jvm/java
    

    Also you will want to put

    export JAVA_HOME=/usr/lib/jvm/java

    in your ~/.bashrc

After www3 (the NameNode master server) has been set up as above, we just have to copy the conf files over to www4 and the other nodes and then fire up Hadoop; a quick loop for that sync is sketched below. Here are the steps after syncing the conf files and making the appropriate (meta)data directories like /usr/local/hadoop/namenode or /usr/local/hadoop/datanode.
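
A rough sketch of that sync step, assuming the hdfs user can already ssh to each node (key setup is covered in the steps below):

    # push the NameNode's conf directory to the JobTracker and all slaves
    for n in $(seq 4 15); do
            scp /usr/local/hadoop/conf/* hdfs@www$n:/usr/local/hadoop/conf/
    done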

  • To start a Hadoop cluster you will need to start both the HDFS and Map/Reduce cluster. Format a new distributed filesystem:
    $ bin/hadoop namenode -format
  • You will get output like:
    [hdfs@www3 hadoop]$ bin/hadoop namenode -format
    12/12/29 02:00:05 INFO namenode.NameNode: STARTUP_MSG: 
    /************************************************************
    STARTUP_MSG: Starting NameNode
    STARTUP_MSG:   host = www3/10.23.23.12
    STARTUP_MSG:   args = [-format]
    STARTUP_MSG:   version = 1.0.4
    STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1393290; compiled by 'hortonfo' on Wed Oct  3 05:13:58 UTC 2012
    ************************************************************/
    Re-format filesystem in /usr/local/hadoop/namenode ? (Y or N) 
    2/12/29 02:15:39 INFO util.GSet: VM type       = 64-bit
    12/12/29 02:15:39 INFO util.GSet: 2% max memory = 17.77875 MB
    12/12/29 02:15:39 INFO util.GSet: capacity      = 2^21 = 2097152 entries
    12/12/29 02:15:39 INFO util.GSet: recommended=2097152, actual=2097152
    12/12/29 02:15:40 INFO namenode.FSNamesystem: fsOwner=hdfs
    12/12/29 02:15:40 INFO namenode.FSNamesystem: supergroup=supergroup
    12/12/29 02:15:40 INFO namenode.FSNamesystem: isPermissionEnabled=true
    12/12/29 02:15:40 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
    12/12/29 02:15:40 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
    12/12/29 02:15:40 INFO namenode.NameNode: Caching file names occuring more than 10 times 
    12/12/29 02:15:40 INFO common.Storage: Image file of size 110 saved in 0 seconds.
    12/12/29 02:15:40 INFO common.Storage: Storage directory /usr/local/hadoop/namenode has been successfully formatted.
    12/12/29 02:15:40 INFO namenode.NameNode: SHUTDOWN_MSG: 
    /************************************************************
    SHUTDOWN_MSG: Shutting down NameNode at www3/10.23.23.12
    ************************************************************/
    
    
  • The hdfs user should have ssh (pubkey) access to all the slaves, as well as to itself, from the NameNode and the JobTracker node. Create the keys for each (ssh into the NameNode and the JobTracker to do this) and add them to all of the authorized_keys files: sudo su - hdfs ; ssh-keygen -t rsa ; cat ~/.ssh/id_rsa.pub Copy that public key to the ~hdfs/.ssh/authorized_keys file on the slaves and the master NameNode.
  • Start HDFS with the following command *** after the slaves are configured/installed *** (see Part 2), running it on the designated NameNode:
    $ bin/start-dfs.sh
  • Start the JobTracker node ***after Part 2!*** and then verify the cluster with the quick check below:
    $ bin/start-mapred.sh
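
To confirm everything came up, jps on each node should list the expected daemons, and a DataNode report should show all the slaves. Roughly:

    # on the NameNode you should see the NameNode process; on the slaves, DataNode and TaskTracker
    [hdfs@www3 hadoop]$ jps
    # summary of live DataNodes and HDFS capacity
    [hdfs@www3 hadoop]$ bin/hadoop dfsadmin -report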

Arch Linux on Mac Pro with Luks Full Disk Encryption

You need: OS X Install CD (I used a copy of Snow Leopard), Arch Linux net install CD

1. Boot into Arch. Wipe hard drive, for example with #badblocks -c 10240 -wsvt random /dev/sda

2. #parted /dev/sda , use mklabel msdos to set the partition table type to “msdos” (instead of “GPT”)

3. #cfdisk /dev/sda , add /dev/sda1 size 1024M , bootable, type 83 (linux) (for /boot); add /dev/sda2 size 8024M, type 82 (swap); add /dev/sda3 size XXXG , type 83 (linux) for / . Save partition table.

4. Reboot into OS X install DVD. Open terminal. #bless --device /dev/disk0sX --setBoot --legacy --verbose …. where “X” is the number for your /boot partition you created. You can find it by doing #diskutil list . Now your mac is configured to boot from /boot .

5. Boot into Arch CD. Set up encrypted swap volume: #cryptsetup -c aes-xts-plain -s 512 -h sha512 -v luksFormat /dev/sda2 (put your passphrase in)… Set up encrypted root volume: #cryptsetup -c aes-xts-plain -y -s 512 luksFormat /dev/sda3 (put your passphrase in)

6. Set up mappers for crypt volumes in /dev: #cryptsetup luksOpen /dev/sda3 root … #cryptsetup luksOpen /dev/sda2 swapDevice

7. Create swap on swapDevice: #mkswap /dev/mapper/swapDevice

8. create /lib/initcpio/hooks/openswap:

# vim: set ft=sh:
run_hook ()
{
cryptsetup luksOpen /dev/sda2 swapDevice
}

9. create /lib/initcpio/install/openswap:

# vim: set ft=sh:
build ()
{
MODULES=""
BINARIES=""
FILES=""
SCRIPT="openswap"
}
help ()
{
cat <<HELPEOF
This opens the encrypted swap partition /dev/sda2 on swapDevice mapper.
HELPEOF
}

10. Edit /etc/mkinitcpio.conf; add “openswap” before “filesystems” but after “encrypt”. Add “resume” between “openswap” and “filesystems”. Should look something like this: HOOKS="base udev autodetect pata scsi sata usb usbinput keymap encrypt openswap resume filesystems"

11. Install arch with /arch/setup , and use /dev/mapper/root for / and /dev/mapper/swapDevice for your swap . Use /dev/sda1 for /boot . I use XFS for / .
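
For reference, the resulting /etc/fstab ends up looking roughly like this (the installer writes it for you; ext2 for /boot is only an illustration, use whatever you picked):

/dev/sda1               /boot   ext2    defaults    0 2
/dev/mapper/root        /       xfs     defaults    0 1
/dev/mapper/swapDevice  none    swap    defaults    0 0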

12. When it comes time to modify the config files after installing packages, modify mkinitcpio.conf as above. Also set MODULES="xfs" (or the module for whatever filesystem you used for /).

13. When it comes time to set up the bootloader grub, make this change to your grub config: kernel /vmlinuz-linux cryptdevice=/dev/sda3:root root=/dev/mapper/root resume=/dev/mapper/swapDevice ro
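
That kernel line goes into a GRUB Legacy menu.lst entry (Legacy is what the kernel ... syntax above corresponds to); roughly, with (hd0,0) being /dev/sda1:

title Arch Linux (LUKS)
root (hd0,0)
kernel /vmlinuz-linux cryptdevice=/dev/sda3:root root=/dev/mapper/root resume=/dev/mapper/swapDevice ro
initrd /initramfs-linux.img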

14. Install grub on /dev/sda

15. reboot. You’ll be asked twice for your luks keyphrase.. one for swap and one for root.

16. #pacman --sync --refresh

17. #pacman -Syu xorg-server
18. #pacman -S xorg-xinit xterm fluxbox xorg-utils xorg-server-utils xf86-video-ati chromium mesa-demos artwiz-fonts bdf-unifont cantarell-fonts font-bitstream-speedo font-misc-ethiopic font-misc-meltho ftgl gsfonts xpdf xv gv ttf-cheapskate ttf-bitstream-vera ttf-freefont ttf-linux-libertine xorg-xlsfonts

18a. edit ~/.xinitrc , make it executable.

#!/bin/sh

xset +fp /usr/share/fonts/local
xset fp rehash
nitrogen -restore
dropbox start
pidgin
exec startfluxbox

19. edit /etc/pacman.conf and uncomment the multilib section for 32-bit support. #pacman --sync --refresh

19a. edit /etc/inittab to make SLiM handle login instead of the text login: uncomment x:5:respawn:/usr/bin/slim >/dev/null 2>&1 and comment out the other xdm line. Uncomment id:5:initdefault: and comment out the runlevel 3 initdefault. To change SLiM themes, edit /etc/slim.conf and set the current_theme line to whatever you like: current_theme fingerprint,default,rear-window,subway,wave,lake,flat,capernoited … for example.

20. #pacman -S dina-font font-mathematica terminus-font profont zsh vim slim slim-themes archlinux-themes-slim alsa-utils alsa-tools alsamixer thunderbird pidgin gtk-theme-switch2 gtk-engines nautilus rsync gnupg irssi flashplugin nitrogen xlockmore dnsutils wget glib wine gpgme cups ntp skype vpnc scrot xclip … etc etc

To update your system from time to time… do pacman -Syu


Cron to Re-Sign DNSSEC Zones Nightly


#!/bin/bash
# this script uses sign_zone.sh from a previous blog post:
# http://packetcloud.net/2011/10/13/script-to-easy-nsec3rsasha1-sign-dnssec-zones/

cd /var/named/dynamic/
for dir in `find . -mindepth 1 -maxdepth 1 -type d -print`; do
        domain=`basename $dir`
        # run in a subshell so the cd does not leak into the next iteration
        ( cd /var/named/dynamic/$domain && /usr/local/bin/sign_zone.sh $domain )
done

/sbin/service named restart
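
I run it from root's cron; a minimal entry in /etc/cron.d looks roughly like this (the script path/name and the time are just examples, adjust to wherever you saved it):

# /etc/cron.d/resign-zones
30 3 * * * root /usr/local/bin/resign_zones.sh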


Script to Generate NSEC3 KSK and ZSK for DNSSEC

Here’s my script to generate the NSEC3-capable keys (a ZSK and a KSK) in /var/named/dynamic for DNSSEC. Usage: generate_keys <domain>


#!/bin/bash
#this script is /usr/local/bin/generate_keys
domain=$1

cd /var/named/dynamic/

# -3 makes NSEC3-capable keys: the first call generates the zone-signing key (ZSK),
# the second (-fk) generates the key-signing key (KSK)
dnssec-keygen -r /dev/urandom -3 $domain
dnssec-keygen -r /dev/urandom -3 -fk $domain

chown named:named /var/named/dynamic/K*

mkdir /var/named/dynamic/$domain
chown named:named /var/named/dynamic/$domain
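
Usage looks like this; example.com is a placeholder and the key tags will differ on your system:

generate_keys example.com
# leaves files like these in /var/named/dynamic/ (algorithm 7 = NSEC3RSASHA1):
#   Kexample.com.+007+12345.key / .private   (ZSK)
#   Kexample.com.+007+54321.key / .private   (KSK)
# plus an empty example.com/ directory for the zone file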


Script to Easy-NSEC3RSASHA1 Sign DNSSEC Zones

DNSSEC involves a lot of commands to learn and type when maintaining your zones; hopefully this simplifies it for you. Usage: sign_zone.sh <domain>. I verified this works with BIND 9.7.3 on Amazon EC2 and also with BIND 9.7.0 on CentOS using the bind97 RPMs and a chroot jail. I store my ZSK and KSK for all domains in /var/named/dynamic, and each zone lives in a subfolder /var/named/dynamic/<domain>. /etc/named.conf is configured to look for the generated <domain>.signed file (a minimal zone stanza is shown at the end of this post). The script automatically increments the zone's serial number and then re-signs. I have a separate script to run this every night from cron.


#!/bin/bash
#this file is /usr/local/bin/sign_zone.sh
domain=$1
nsec3_salt=`/usr/local/bin/random_salt`

cd /var/named/dynamic/$domain

ZSK=`grep -iH 'zone' ../K${domain}.*key | cut -d':' -f1`
KSK=`grep -iH 'key-sign' ../K${domain}.*key | cut -d':' -f1`

SOA_SERIAL=`grep serial $domain | sed -e 's/^[ \t]*//g' | awk '{print $1}'`
NEW_SERIAL=`expr $SOA_SERIAL + 1`

echo "detected SOA SERIAL: $SOA_SERIAL"
echo "generating a new zone with NEW SOA SERIAL: $NEW_SERIAL"

cat $domain | sed -e "s/[0-9][0-9]*.*;.*serial/${NEW_SERIAL} ; serial/" > $domain.new
cp $domain.new $domain

echo "detected ZSK: $ZSK"
echo "detected KSK: $KSK"
echo "running signzone..."
echo dnssec-signzone -3 $nsec3_salt -a -S -k $KSK $domain $ZSK
dnssec-signzone -3 $nsec3_salt -a -S -k $KSK $domain $ZSK

Code to make a random salt for above:

#!/bin/bash
# save this file as /usr/local/bin/random_salt
dd if=/dev/urandom bs=16 count=1 2>/dev/null | hexdump -e \"%08x\"
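
The matching named.conf stanza just points at the generated .signed file; roughly like this, with example.com as a placeholder and the path relative to BIND's directory option:

zone "example.com" IN {
        type master;
        file "dynamic/example.com/example.com.signed";
};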


Screenshot Script to Put Image In Dropbox – Puts URL in Clipboard – For *nix

Here are two scripts that do what GrabBox does on OS X – they take screenshots and place each one in your public Dropbox folder (edit the URL to fit your id where xxxxxxx is). The first script, snap.sh, grabs the full screen. The second lets you click on a window, and that window will be grabbed. I set up some fluxbox key bindings to call these scripts (an example is shown after the scripts). The URL is injected into your clipboard using xclip for quick pasting into an email or IRC :-)

snap.sh:

#!/bin/bash
RAND=`cat /dev/urandom| tr -dc 'a-zA-Z0-9' | fold -w 24 | head -n 1`
IMAGE=${RAND}.png

scrot -d 0 -q 10 ~/$IMAGE

cp ~/$IMAGE ~/Dropbox/Public/Screenshots/${IMAGE}
echo "http://dl.dropbox.com/u/xxxxxxxx/Screenshots/${IMAGE}" | xclip
rm -f ~/$IMAGE
#http://dl.dropbox.com/u/xxxxxxx/Screenshots/2vaq%7Exr5lr6x.png

snap2.sh:

#!/bin/bash

RAND=`cat /dev/urandom| tr -dc 'a-zA-Z0-9' | fold -w 24 | head -n 1`
IMAGE=${RAND}.png

scrot -d 0 -b -s -q 10 ~/$IMAGE

cp ~/$IMAGE ~/Dropbox/Public/Screenshots/${IMAGE}
echo "http://dl.dropbox.com/u/xxxxxxx/Screenshots/${IMAGE}" | xclip
rm -f ~/$IMAGE
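
The fluxbox key bindings I mentioned go in ~/.fluxbox/keys; something like this (the key choices and script paths are just examples):

# full-screen grab on PrintScreen, window grab on Alt+PrintScreen
None Print :Exec ~/bin/snap.sh
Mod1 Print :Exec ~/bin/snap2.sh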


Upgrade from FreeBSD 8.1 to 8.2 REL

1. cd /usr/src
2. edit supfile to use RELENG_8_2
*default host=cvsup15.FreeBSD.org
*default tag=RELENG_8_2
*default prefix=/usr
*default base=/var/db
*default release=cvs delete use-rel-suffix
src-all
3. cvsup -g -L 2 supfile
4. cd /usr/obj
5. chflags -R noschg *
6. rm -rf *
7. cd /usr/src && mergemaster -p
8. make buildworld

8.5. Make sure to edit your /root/kernels/PACKETCLOUD or whatever kernel config file you have

9. make buildkernel KERNCONF=PACKETCLOUD
10. make installkernel KERNCONF=PACKETCLOUD
11. mergemaster -p
12. make installworld
13. mergemaster
14. reboot
15. redo your ports…. cd /usr/ports …. portsnap fetch …. portsnap update
16. pkg_version -v | grep -v =
17. portupgrade -ra
18. etc etc. as usual
