Friday, April 16, 2010

Oracle 10g RAC Installation on AIX (cookbook)

This installation refers to Oracle RAC 10.2.0.4 on AIX 6.1. We used GPFS 3.2, but this document does not cover the GPFS installation.
  1. Reserve 2 virtual IPs and register them in DNS (these IPs will be attached to the machines during the CRS installation; do not attach them yourself)

    10.200.20.229 cbdst1-vip
    10.200.20.230 cbdst2-vip
  2. Check hardware
    Check the hardware bitmode; the expected output is 64.
    /usr/bin/getconf HARDWARE_BITMODE

    Check the hostname; the hostname of each server must match the RAC public node name.
    hostname

    Check memory (minimum 512 MB, but in practice > 4 GB for test and > 8 GB for production).
    lsattr -El sys0 -a realmem



    Check the internal disk space for the Oracle software; it must be > 12 GB (CRS_HOME, ASM_HOME, ORACLE_HOME). Also plan the other directories required by the DBA.

    Check paging space (paging space = 2 x RAM).
    lsps -a

    Check /tmp; it must be > 400 MB.
    df -k /tmp

    Check the OS level and compatibility; it must be the same on both nodes (for the latest certification information for AIX 5.2, 5.3 and 6.1 refer to Metalink Note 282036.1).
    oslevel -s
    check Filesets

    lslpp -l bos.adt.base
    lslpp -l bos.adt.lib
    lslpp -l bos.adt.libm
    lslpp -l bos.perf.libperfstat
    lslpp -l bos.perf.perfstat
    lslpp -l bos.perf.proctools
    lslpp -l rsct.basic.rte
    lslpp -l rsct.compat.clients.rte
    lslpp -l xlC.aix61.rte   #This can be xlC.aix50.rte for AIX 5.3; the version must be 7.0.0.4 or 8.x
    lslpp -l xlC.rte

    Check APAR fixes


    For 5.3

    /usr/sbin/instfix -i -k "IY68989 IY68874 IY70031 IY76140 IY89080"
    For 6.1
    /usr/sbin/instfix -i -k "IZ10223"
  3. Check the software user and group on both nodes
    id cbdst
    uid=109(cbdst) gid=109(dbat) groups=1(staff)

    cat /etc/group | grep cbdst
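
    If the user and group do not exist yet, a minimal sketch for creating them as root on both nodes (the numeric IDs follow the example output above and the home directory is the one used later in this cookbook; adjust both to your own standard):
    mkgroup id=109 dbat                                                 #group for the Oracle software
    mkuser id=109 pgrp=dbat groups=staff home=/cbdsthome/cbdst cbdst    #software owner
    passwd cbdst                                                        #set its initial password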
  4. Configure  kernel  parameters

    Configure Shell Limits for cbdst and root user
    ulimit -a
    time(seconds)         unlimited
    file(blocks)          unlimited
    data(kbytes)          unlimited
    stack(kbytes)         4194304
    memory(kbytes)        unlimited
    coredump(blocks)      unlimited
    nofiles(descriptors)  unlimited
    threads(per process)  unlimited
    processes(per user)   unlimited
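
    A minimal sketch for setting these limits persistently with chuser, as root on both nodes (size values are in 512-byte blocks and -1 means unlimited; the stack limit is left at the existing value here, adjust it to your own standard):
    chuser fsize=-1 data=-1 rss=-1 nofiles=-1 cpu=-1 cbdst   #software user
    chuser fsize=-1 data=-1 rss=-1 nofiles=-1 cpu=-1 root    #root user
    lsuser -a fsize data rss nofiles cpu cbdst               #verify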

    Configure System Configuration Parameters
    lsattr -El sys0 -a maxuproc             #check; it must be 4096
    chdev -l sys0 -a maxuproc='4096'        #set

    Verify that the lru_file_repage parameter is set to 0 (Run it with root user )
    vmo -L lru_file_repage                                           #check
    vmo -p -o lru_file_repage=0                                  #set

    Set asynchronous I/O (via smitty aio on AIX 5.3; it does not need to be configured on AIX 6.1)
    lsattr -El aio0         #check on AIX 5.3; not applicable on AIX 6.1

    Configure Network Tuning Parameters
    for i in ipqmaxlen rfc1323 sb_max tcp_recvspace tcp_sendspace udp_recvspace udp_sendspace
    do
        no -a |grep $i
    done

    ipqmaxlen = 512

    rfc1323 = 1
    sb_max = 1310720
    tcp_recvspace = 65536
    tcp_sendspace = 65536
    udp_recvspace = 655360
    udp_sendspace = 65536
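
    A sketch for making these settings persistent with the no command as root on both nodes (ipqmaxlen is a load-time parameter, so it is set with -r and takes effect at the next reboot):
    no -p -o rfc1323=1
    no -p -o sb_max=1310720
    no -p -o tcp_recvspace=65536
    no -p -o tcp_sendspace=65536
    no -p -o udp_recvspace=655360
    no -p -o udp_sendspace=65536
    no -r -o ipqmaxlen=512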
  5. Network Identification

    Ping the nodes from each node.
    ping santaro    #10.200.20.29
    ping pandora    #10.200.20.30

    ifconfig -l     #list the interfaces on each node
    ifconfig -a

    Check on each node that the same networks are defined on the same interface names, e.g. for pandora:
    en7: public network
    en6: private network
    en5: backup network

    en5: flags=1e080863,480
    inet 10.200.96.30 netmask 0xffffff00 broadcast 10.200.96.255
    tcp_sendspace 262144 tcp_recvspace 262144 rfc1323 1

    en6: flags=1e080863,480
    inet 7.0.0.2 netmask 0xffffff00 broadcast 7.0.0.255
    tcp_sendspace 262144 tcp_recvspace 262144 rfc1323 1

    en7: flags=1e080863,480
    inet 10.200.20.30 netmask 0xffffff00 broadcast 10.200.20.255
    tcp_sendspace 262144 tcp_recvspace 262144 rfc1323 1
    lo0: flags=e08084b
    inet 127.0.0.1 netmask 0xff000000 broadcast 127.255.255.255
    inet6 ::1/0
    tcp_sendspace 131072 tcp_recvspace 131072 rfc1323 1

    for i in `ifconfig -l`
    do
       echo $i
       for attribut in netaddr netmask broadcast state
       do
           lsattr -El $i -a $attribut
       done
    done
  6. Plan the IPs (virtual, GPFS and ODM IPs) and add them to /etc/hosts
    #Public network for Oracle RAC
    10.200.20.29 santaro
    10.200.20.30 pandora

    #Virtual IP addresses for Oracle RAC
    10.200.20.229 cbdst1-vip
    10.200.20.230 cbdst2-vip

    #Private (interconnect) for Oracle RAC
    7.0.0.1 santaro-priv
    7.0.0.2 pandora-priv

    #Interconnect for GPFS
    7.0.0.1 santaro-rac
    7.0.0.2 pandora-rac

    #IPs for the ODM (backup) network
    10.200.96.29 santaro-bkp
    10.200.96.30 pandora-bkp
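
    A quick sketch to verify that all of these names resolve consistently (run on both nodes; the names are the ones defined above):
    for h in santaro pandora cbdst1-vip cbdst2-vip santaro-priv pandora-priv santaro-rac pandora-rac
    do
       host $h
    done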
  7. Check Default gateway on public interface
    netstat -r |grep default 
  8. User equivalence setup (it can be implemented with rsh or ssh; we will implement ssh). ssh-keygen may be installed in a different path.

    On the first node (santaro), run the following step by step and accept the defaults with Enter:
    mkdir ~/.ssh
    chmod 700 ~/.ssh
    /usr/local/bin/ssh-keygen -t rsa
    /usr/local/bin/ssh-keygen -t dsa
    touch ~/.ssh/authorized_keys
    cd ~/.ssh
    cat $HOME/.ssh/id_rsa.pub>>$HOME/.ssh/authorized_keys
    cat $HOME/.ssh/id_dsa.pub>>$HOME/.ssh/authorized_keys


    On the second node (pandora), run the same steps and accept the defaults with Enter:
    mkdir ~/.ssh
    chmod 700 ~/.ssh
    /usr/local/bin/ssh-keygen -t rsa
    /usr/local/bin/ssh-keygen -t dsa
    touch ~/.ssh/authorized_keys
    cd ~/.ssh
    cat $HOME/.ssh/id_rsa.pub>>$HOME/.ssh/authorized_keys
    cat $HOME/.ssh/id_dsa.pub>>$HOME/.ssh/authorized_keys

    Copy the authorized_keys file from the first node (run on pandora):
    rcp cbdst@santaro:$HOME/.ssh/authorized_keys $HOME/.ssh/authorized_keys.node1

    Merge the authorized_keys files from both nodes (run on pandora):
    cat $HOME/.ssh/authorized_keys.node1 >>$HOME/.ssh/authorized_keys
    rm $HOME/.ssh/authorized_keys.node1

    Copy the merged file back to the first node (run on pandora):
    rcp $HOME/.ssh/authorized_keys cbdst@santaro:$HOME/.ssh/authorized_keys

    Run this command on both nodes:
    chmod 600 ~/.ssh/authorized_keys

    Run all of the following commands twice on both nodes, as root and as the software user (cbdst); they will ask a confirmation question on the first run.

    ssh santaro date
    ssh pandora date
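
    A quick sketch to verify equivalence against all node names in one pass (run as cbdst on each node; answer the host-key question on the first run, after that nothing should prompt). The private names are included only as an extra check:
    for node in santaro pandora santaro-priv pandora-priv
    do
       ssh $node date
    done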
  9. Prevent Oracle Clusterware installation errors caused by stty commands: put the following in the .profile of root and the software user

    if [ -t 0 ]; then

    stty intr ^C
    fi
  10. Edit .profile and re-source it (. .profile)
    if [ -t 0 ]; then

    stty intr ^C
    fi

    export ORACLE_BASE=/cbdsthome/oracle/app/oracle
    export ORA_CRS_HOME=/cbdsthome/crshome
    export CRS_HOME=$ORA_CRS_HOME
    export ORACLE_HOME=/cbdsthome/oracle/app/oracle/product/10g
    export ORA_NLS10=$ORACLE_HOME/nls/data
    export ORACLE_OWNER=cbdst
    export ORACLE_SID=NGBSTEST1   #This will be NGBSTEST2 on the other node


    export LD_LIBRARY_PATH=$ORACLE_HOME/lib:$CRS_HOME/lib:$ORACLE_HOME/lib32:$CRS_HOME/lib32:/usr/ccs/lib:/usr/ucblib:/usr/java/lib:$LD_LIBRARY_PATH
    export PATH=$ORACLE_HOME/bin:$ORA_CRS_HOME/bin:/usr/sbin:/usr/bin:/bin:/usr/ccs/bin:/usr/ucb:/usr/local/bin:/opt/gnu/bin:.:/usr/bin/X11:$ORACLE_HOME/OPatch
    export AIXTHREAD_SCOPE=S
    export LIBPATH=$LD_LIBRARY_PATH

    export ORA_SQLDBA_MODE=line
    export NLS_LANG=Turkish_Turkey.WE8ISO8859P9
    export NLS_NUMERIC_CHARACTERS='.,'


    export TERM=vt100
    export EDITOR=vi
    set -o vi
    stty erase ^? 1>/dev/null 2>&1
    umask 022

    export TEMP=/tmp
    export TMP=/tmp
    export TMPDIR=/tmp
  11. Check all disks

    for l in `lspv | awk '{print $1}'`
    do
      lscfg -vl $l | grep hdisk
    done
  12. Check that the same disk names correspond to the same LUNs on both nodes (this example is for the 8300 storage)
    node1='santaro'
    node2='pandora'
    for l in `lspv | grep -v rootvg | awk '{print $1}'`
    do
      node1_lun_id=`ssh $node1 lscfg -vl $l | grep "Serial Number" | sed 's/ Serial Number//g' | sed 's/\.//g'`
      node2_lun_id=`ssh $node2 lscfg -vl $l | grep "Serial Number" | sed 's/ Serial Number//g' | sed 's/\.//g'`
      if [ "$node1_lun_id" = "$node2_lun_id" ]
      then
        echo "$l node1_lun_id=$node1_lun_id node2_lun_id=$node2_lun_id equal"
      else
        echo "$l node1_lun_id=$node1_lun_id node2_lun_id=$node2_lun_id NOT equal"
      fi
    done
  13. Check the OCR and voting disks. They must correspond to the same disks on both nodes, and their reserve_policy must be no_reserve.
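    A sketch of the reserve_policy check and fix for one disk, run as root on both nodes (hdisk4 is taken from the example in the next step; for some non-MPIO drivers the attribute is called reserve_lock instead):
    lsattr -El hdisk4 -a reserve_policy               #check
    chdev -l hdisk4 -a reserve_policy=no_reserve      #set; if the disk is busy, chdev may need -P and a reboot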
  14. Verify from both nodes that the OCR and voting devices correspond to the same disks (e.g. /dev/ocrdisk1)

    On node1 (santaro)
    santaro@cbdst:/cbdsthome/cbdst$ ls -lrt /dev/ocrdisk1
    crw-r----- 1 root dbat 19, 40 May 06 11:57 /dev/ocrdisk1

    santaro@cbdst:/cbdsthome/cbdst$ ls -lrt /dev/|grep "19, 40"
    crw------- 1 root system 19, 40 Apr 12 15:56 rhdisk4
    brw------- 1 root system 19, 40 Apr 12 15:56 hdisk4
    crw-r----- 1 root dbat 19, 40 May 06 11:57 ocrdisk1

    On node2(pandora)

    pandora@cbdst:/cbdsthome/cbdst$ ls -lrt /dev/ocrdisk1

    crw-r----- 1 root dbat 20, 42 Jun 09 09:41 /dev/ocrdisk1

    pandora@cbdst:/cbdsthome/cbdst$ ls -lrt /dev/|grep "20, 42"
    crw------- 1 root system 20, 42 Apr 12 15:56 rhdisk4
    brw------- 1 root system 20, 42 Apr 12 15:56 hdisk4
    crw-r----- 1 root dbat 20, 42 Jun 09 09:41 ocrdisk1

    We can see that on both nodes /dev/ocrdisk1 corresponds to the same disk (hdisk4). Then check all the OCR and voting disks:


    ls -lrt /dev/ocrdisk1
    ls -lrt /dev/ocrdisk2


    ls -lrt /dev/votedisk1
    ls -lrt /dev/votedisk2
    ls -lrt /dev/votedisk3
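
    A sketch that, for each node, shows which hdisk shares the same major,minor numbers as each OCR/voting device (run from either node as cbdst; the device names are the ones used above):
    for n in santaro pandora
    do
       echo "### $n"
       for d in ocrdisk1 ocrdisk2 votedisk1 votedisk2 votedisk3
       do
          mm=`ssh $n ls -l /dev/$d | awk '{print $5, $6}'`
          echo "$d -> major,minor $mm"
          ssh $n ls -l /dev/ | grep "$mm" | grep hdisk
       done
    done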
  15. Change the ownership and permissions of the OCR and voting disks on both nodes as root

    chown cbdst:dbat /dev/votedisk1
    chown cbdst:dbat /dev/votedisk2
    chown cbdst:dbat /dev/votedisk3
    chown cbdst:dbat /dev/ocrdisk1
    chown cbdst:dbat /dev/ocrdisk2



    chmod 660 /dev/ocrdisk1
    chmod 660 /dev/ocrdisk2
    chmod 660 /dev/votedisk1
    chmod 660 /dev/votedisk2
    chmod 660 /dev/votedisk3
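
    A quick check that the ownership and permissions now match on both nodes:
    for n in santaro pandora
    do
       ssh $n ls -l /dev/ocrdisk1 /dev/ocrdisk2 /dev/votedisk1 /dev/votedisk2 /dev/votedisk3
    done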
  16. Format (zero out) the OCR and voting disks as root on both nodes

    for i in 1 2
    do
      dd if=/dev/zero of=/dev/ocrdisk$i bs=8192 count=25000
    done

    for i in 1 2 3
    do
      dd if=/dev/zero of=/dev/votedisk$i bs=8192 count=25000
    done

     
  17. CRS Installation preparation
    1. Run rootpre.sh on both nodes as ROOT
      cd
      cd ./clusterware/rootpre
      ./rootpre.sh

      For AIX 6.1, download patch 6613550 (because of a bug) and run rootpre.sh from that directory instead:
      cd <6613550>
      ./rootpre.sh
    2. Check that unzip is available on both nodes as the Oracle software user
      which unzip 
    3. Run runcluvfy.sh as the Oracle software user and compare cluvf.txt with the cookbook (on one node only)
      cd
      cd ./clusterware/cluvfy
      ./runcluvfy.sh stage -pre crsinst -n santaro,pandora -verbose >/tmp/cluvf.txt

      On AIX 6.1 there is another bug, so the output of runcluvfy.sh may be incomplete.
    4. Create a symbolic link from /usr/sbin/lsattr to /etc/lsattr on both nodes as ROOT
      ln -s /usr/sbin/lsattr /etc/lsattr
    5. Set the OS capabilities the CRS user needs to run the Oracle Clusterware software, on both nodes as ROOT

      /usr/sbin/lsuser -a capabilities cbdst
      /usr/bin/chuser capabilities=CAP_BYPASS_RAC_VMM,CAP_PROPAGATE,CAP_NUMA_ATTACH cbdst
      /usr/sbin/lsuser -a capabilities cbdst 

  18. CRS installation (restarting both nodes before the installation can be useful)
    1. Run slibclean on both nodes as ROOT
      /usr/sbin/slibclean
    2. Start the installer on one node as the Oracle user (the installer does not recognize AIX 6.1, so we can continue even if the prerequisite check reports errors)
      cd

      cd ./clusterware
      ./runInstaller
    3. Specify the oraInventory location, CRS_HOME and pass the prerequisite check. DO NOT install CRS_HOME inside ORACLE_HOME.
    4. Specify the cluster configuration, for example:
      specify the cluster name (cbdstcrs)

      edit and add nodes as

      Public Node Name:santaro
      Private Node Name:santaro-priv
      Virtual Host Name:cbdst1-vip


      Public Node Name:pandora
      Private Node Name:pandora-priv
      Virtual Host Name:cbdst2-vip
    5. Specify the network interfaces, for example:
      10.200.20.0 Public

      10.200.96.0 Do not use
      7.0.0.0 Private
    6. Specify OCR locations
      /dev/ocrdisk1

      /dev/ocrdisk2
    7. Specify Voting disk locations
      /dev/votedisk1

      /dev/votedisk2
      /dev/votedisk3
    8. Wait until the root scripts dialog box appears (Bug 4437469: $ENTSTAT -d $_IF | $GREP -iEq ".*lan state:.*operational.*"). DO NOT run root.sh before THIS STEP on BOTH NODES.
      vi $CRS_HOME/bin/racgvip   #ON BOTH NODES as ROOT

      Replace the line
      $ENTSTAT -d $_IF |$GREP -iEq ".*link.*status.*:.*up.*"
      with
      $ENTSTAT -d $_IF |$GREP -iEq '.*lan.*state.*:.*operational.*|.*link.*status.*:.*up.*|.*port.*operational.*state.*:.*up.*'
    9. After root.sh on node2 it will give "The given interface(s), "en7" is not public. Public interfaces should be used to configure virtual IPs." Run vipca AS ROOT ON THE FIRST NODE (node1), fix it and then click OK.

      Connect to one node as root, run vipca from $CRS_HOME/bin and fix the VIP addresses and the public interface (in this example en7, cbdst1-vip and cbdst2-vip).
    10. The installation is complete; now check the results

      ifconfig -a #Check on both nodes that the virtual IPs have been brought up

      crsctl check crs
      crsctl query crs activeversion
      crs_stat -t #GSD,ONS,VIP applications must be online
      olsnodes #This should return all the nodes of the cluster
      oifcfg getif #check public and private networks
      ocrcheck #check ocr
      crsctl query css votedisk #check voting disk
      ocrconfig -export /cbdstdump/ocrdump.dmp -s online # Export Oracle Cluster Registry content as root
      ocrconfig -showbackup #check the automatic backups; right after installation the list can be empty
    11. Change the MISSCOUNT definition from its default value and restart the nodes
      Keep only one node's clusterware active while changing the value: stop the node applications on node1, change misscount from node2, then start node1's applications again.

      srvctl stop nodeapps -n santaro #as ROOT on santaro (node1)
      crsctl get css misscount #as ROOT on pandora (node2)
      crsctl set css misscount 30 #as ROOT on pandora(node2)
      srvctl start  nodeapps -n santaro #as ROOT on santaro (node1)
    12. Set diagwait (we can also follow the note "Diagnosing Oracle Clusterware Node Evictions (Diagwait)")

      crsctl stop crs                                #as ROOT on both node
      $CRS_HOME/bin/oprocd stop  #as ROOT on both node
      ps -ef |egrep "crsd.bin|ocssd.bin|evmd.bin|oprocd"   #as ROOT on both node
      crsctl set css diagwait 13 -force                 #as ROOT on santaro(node1)
      crsctl get css diagwait                                #as ROOT on both node
      crsctl start crs                                              #as ROOT on both node
      crsctl check crs                                           #as ROOT on both node

  19. CRS patch installation (e.g. 10.2.0.4)
    1. Stop the node applications as root (e.g. the nodes are santaro and pandora)
      srvctl stop nodeapps -n santaro

      srvctl stop nodeapps -n pandora
    2. Stop CRS on both nodes as root
      crsctl stop crs
    3. Take a backup of CRS_HOME and the oraInventory on both nodes as the software user with the tar command, as sketched below.
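      A minimal sketch (the backup target /cbdstdump and the oraInventory path are assumptions; check the real inventory location in /etc/oraInst.loc and use your own backup directory):
      tar -cvf /cbdstdump/crs_home_`date +%Y%m%d`.tar $CRS_HOME
      tar -cvf /cbdstdump/oraInventory_`date +%Y%m%d`.tar /cbdsthome/oracle/app/oracle/oraInventory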
    4. Execute the following commands on both nodes as root (because of bug 6910119)
      chown -R cbdst $CRS_HOME/inventory/Templates/*
      chgrp -R dbat $CRS_HOME/inventory/Templates/*
    5. Execute the preupdate.sh script from $CRS_HOME/install on both nodes as root
      cd $CRS_HOME/install
      ./preupdate.sh -crshome $CRS_HOME -crsuser cbdst
    6. Run this on both nodes as root
      ln -s /usr/sbin/sync /usr/bin/sync
    7. Run slibclean on both nodes
      /usr/sbin/slibclean
    8. cd to the PATCH/Disk1 directory and run runInstaller as the software user
      ./runInstaller -ignoreSysPrereqs
    9. Choose CRS_HOME to patch and both nodes to patch
    10. You will get a copy error; you can ignore it if OPatch reports OK after the installation (because of bug 6475472)
    11. After the installer finishes, run the two root scripts on both nodes (you may have VIP problems, but they will be fixed after the bundle patch)
  20. Oracle Home  Install
    1. Run /usr/sbin/slibclean as ROOT on ALL NODES
    2. run install database
      cd
      cd ./database
      ./runInstaller
    3. Specify the ORACLE_HOME name and path
      Select all nodes to install on (not a local installation).

      The prerequisite check may give errors; we can continue if the only failed component is bos.cifs_fs.rte 5.3.0.1, and it will also give errors on AIX 6.1.
      Select "install database only".
  21. Oracle Home products install (like Companion, Client, etc.)
  22. Oracle Home patch (e.g. 10.2.0.4) install
    1. Run /usr/sbin/slibclean as ROOT on ALL NODES
    2. run runInstaller
      cd
      ./runInstaller
  23. Find the latest OPatch version and apply it
    cp p6880880_102000_AIX64-5L.zip $ORACLE_HOME

    cd $ORACLE_HOME
    mv OPatch OPatch_OLD
    unzip p6880880_102000_AIX64-5L.zip
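
    Verify the new OPatch version afterwards as the software user:
    $ORACLE_HOME/OPatch/opatch version
    $ORACLE_HOME/OPatch/opatch lsinventory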
  24. Read Note 405820.1, find the latest bundle patches for CRS_HOME and ORACLE_HOME, and apply them as explained in their README files
    9294403- CRS BUNDLE #4
    9352164 10.2.0.4 database patchset
  25. Now we can create the database with DBCA
