Title: Better with Business Intelligence
1Better with Business Intelligence
- How to use the best of large application
development practices more effectively with
Business Intelligence
2Scott Currie
Chief Executive Officer
scott.currie_at_varigence.com
Tel (800) 477-5260, Cell (425) 533-4228, Fax (866) 851-4230
3Resume in Logos
ExpenseAdvisor
- Managed/Native Integration
- Compiler Backend and Linker
- Performance
- Customer Engagements
- 64-bit Just In Time (JIT) Compilers
4Varigence
- Founded 2008
- Profitable since first month
- 5 employees
- Washington
- South Carolina
- Products
- Hadron
- Mist
- Vivid
- Council
5Evolution of Application Development
- Text Only
- Text editors (vim, emacs)
- Command Line Tools
- Text Debug (dbg, gdb)
- UI in binary resource files
- Visual Only
- Visual designers for logic
- Object graph visualization
- No round-tripping code
- Human Editable Code Is the Primary Asset
- Visual tools enhance text
- Lots of deep analysis
6Trends In Software Development
- Abstraction and Reuse
- 50 defects per 1000 lines of code
- Doesn't matter if it's Assembly, COBOL, or C
- Framework-level services are key (garbage collection, BCL, code security, etc.)
- Reusable patterns and practices (PPG, MVC, PIMPL, etc.)
- Large Distributed Team Development
- How to develop in large distributed international teams (Team System)
- Agile development methodologies can increase responsiveness to customer needs
- Importance of assets, dev testing, unit testing, integrated testing
- Tools and Automation
- Documentation/visualization from code (XMLDoc, Class Viewer, etc.)
- Static Analysis capable of massive code quality improvements (FxCop, etc.)
- Dynamic code auditing capable of further code quality improvements (SEH, RTC, CLR sec, etc.)
- Code-Driven with Designer Support
- Rapid development and debug (keyboard, IntelliSense, etc.)
- Importance of consistent UI design
CHALLENGE: Apply these trends to Business Intelligence solutions!
790 minutes
- Problems and Opportunities
- What you can do today
- Release Management
- Issue Management
- Testing
- Unit Testing
- User Acceptance Testing
- Source Control
- Branch Management
- Continuous Integration
- Daily Builds
- Pre-Commit Checkin
- Performance Goals
- Implementation
- Futures and Varigence Products
8Release Management
Automated Signoff Test Suite
Signoff Test Plan
Signoff Criteria
Signoff process
Run in parallel with gradual transition
Flip the switch and pray
9Release Management
- No real verification process
- Implied Criteria
- Feature complete
- Clean build
- Maybe clean run from scratch
10Release Management
- Verification cost shifted to users
- Issues handled reactively
- Limited to corners exposed during transition
- Implied Criteria
- Feature complete
- Clean build
- Maybe clean run from scratch
- Works for at least a subset of user scenarios
11Release Management
- Still the case that
- Verification cost shifted to users
- Issues handled reactively
- Limited to corners exposed during transition
- But now someone is on the hook
- Accountability not well aligned with ability and responsibility to prevent and address issues
- Doesn't necessarily translate to accountability in the real world
- Is that a new requirement or an existing one?
- Implied Criteria
- Feature complete
- Clean build
- Maybe clean run from scratch
- Works for at least a subset of user scenarios
- Someone is accountable if it breaks
12Release Management
- Still the case that
- Verification cost shifted to users
- Detail Issues handled reactively
- Details limited to transition corners
- Written list of important attributes of release
- Accountability is now aligned with ability and responsibility to prevent and address issues
- But only for most important criteria (top 10ish)
- Implied Criteria
- Feature complete
- Clean build
- Maybe clean run from scratch
- Works for at least a subset of user scenarios
- Someone is accountable if it breaks
- Shared understanding of release goals
13Release Management
- Expensive to implement, since
- Test plan must be created and reviewed
- Tests must be manually run after EVERY change to a release candidate
- Real tests that show passing and failing cases
- Can be run comprehensively and verified before users ever see the solution
- Note how the usage differs from specifications
- Implied Criteria
- Feature complete
- Clean build
- Maybe clean run from scratch
- Works for at least a subset of user scenarios
- Someone is accountable if it breaks
- Shared understanding of release goals
- Exhibits specified behaviors in tests
14Release Management
- Large upfront investment
- Unit tests
- Integration Tests
- User Acceptance Tests
- DO NOT USE AS A POLITICAL WEAPON
- If users find an issue not in the test suite
- Fix it and add it to the test suite
- It is everyone's fault that it's not in the test suite
- If there are personnel issues, it is orthogonal
- Implied Criteria
- Feature complete
- Clean build
- Maybe clean run from scratch
- Works for at least a subset of user scenarios
- Someone is accountable if it breaks
- Shared understanding of release goals
- Exhibits specified behaviors in tests
- Did not punt important issues for fear of cycle
time
15Issue Management
- No work done unless it is in issue management
- Every check-in associated with one (or a few small) issues
- If filing bugs feels like too much overhead, get a different bug tracking system
- Don't accumulate bug debt
- Err on the side of fixing bugs before writing new features
- If you're never going to fix something, Resolve "Won't Fix"
- Use bugs as a mechanism for locking down releases
- ZBB: Zero Bug Bounce
- ZRBB: Zero Resolved Bug Bounce
16Unit Testing
- There are several decent unit testing frameworks for T-SQL and data warehouses
- VSTSDB
- TSQLUnit
- utTSQL
- T.S.T.
- Pick one and stick with it
- Drive for 100% code coverage
- Remember that code coverage includes both your actual code and the patterns in the incoming data set
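As a minimal sketch of the last point, the tests below cover a query once per incoming data pattern (typical rows, NULL amounts, empty input) rather than once per line of code. It assumes Python's unittest and an in-memory SQLite database as stand-ins for the T-SQL frameworks listed above; the `sales` table, the `TOTAL_BY_REGION` query, and `run_query` are hypothetical names made up for illustration.

```python
import sqlite3
import unittest

# Hypothetical query under test; in a real project it would live in its own .sql file.
TOTAL_BY_REGION = """
SELECT region, SUM(COALESCE(amount, 0)) AS total
FROM sales
GROUP BY region
ORDER BY region
"""

def run_query(rows):
    """Load rows into a throwaway in-memory database and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    result = conn.execute(TOTAL_BY_REGION).fetchall()
    conn.close()
    return result

class TotalByRegionTests(unittest.TestCase):
    # One test per data pattern the incoming data set can exhibit.
    def test_typical_rows(self):
        self.assertEqual(run_query([("East", 10.0), ("West", 5.0)]),
                         [("East", 10.0), ("West", 5.0)])

    def test_null_amount_counts_as_zero(self):
        self.assertEqual(run_query([("East", None), ("East", 4.0)]),
                         [("East", 4.0)])

    def test_empty_input(self):
        self.assertEqual(run_query([]), [])
```

Saved as a module, this can be run with `python -m unittest`. Each time a new data pattern shows up in production, it gets its own test case.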
17User Acceptance Testing
- Excel
- Top 5 REAL Excel workbooks targeting your solution
- Using VSTO or VBScript, write automation to do the following
- Update connection to target test environment
- Refresh all data
- Add validation rules
- Hashes or value comparisons for historical values that should not change
- Thresholds for incremental data (e.g. Q2 = Q1 +/- 20%)
- Known relationships among data fields
- Presence/absence of dimension values
- Write output to convenient location for triggering pass/fail
- Every time you find/fix a bug, add a validation rule
- SSRS
- Export to Excel and do the same thing
- T-SQL
- Top 5 REAL analyst T-SQL queries
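The four kinds of validation rule listed for Excel can be sketched as follows. This is an illustration only, written in Python rather than VSTO or VBScript, and it assumes the refreshed workbook data has already been exported to plain Python values. The function names, the `revenue`/`cost`/`margin` fields, and the sample figures are all hypothetical.

```python
import hashlib

def hash_rows(rows):
    """Order-independent fingerprint of historical rows that should never change."""
    digest = hashlib.sha256()
    for row in sorted(repr(r) for r in rows):
        digest.update(row.encode("utf-8"))
    return digest.hexdigest()

def within_threshold(current, previous, tolerance=0.20):
    """True when current is within +/- tolerance (a fraction) of previous."""
    return abs(current - previous) <= tolerance * abs(previous)

def validate(history, baseline_hash, q1, q2, facts, required_regions):
    """Return a list of failure messages; an empty list means pass."""
    failures = []
    if hash_rows(history) != baseline_hash:
        failures.append("historical values changed")
    if not within_threshold(q2, q1):
        failures.append("Q2 outside +/-20% of Q1")
    for f in facts:
        # Assumed known relationship among fields: margin = revenue - cost.
        if abs(f["revenue"] - f["cost"] - f["margin"]) > 0.005:
            failures.append(f"margin mismatch in {f['region']}")
    missing = set(required_regions) - {f["region"] for f in facts}
    if missing:
        failures.append(f"missing dimension values: {sorted(missing)}")
    return failures

# Made-up sample data: West is required but absent, so the run fails.
history = [("2008-Q4", 100.0), ("2009-Q1", 110.0)]
facts = [{"region": "East", "revenue": 50.0, "cost": 30.0, "margin": 20.0}]
failures = validate(history, hash_rows(history), q1=110.0, q2=125.0,
                    facts=facts, required_regions=["East", "West"])
print("PASS" if not failures else "FAIL: " + "; ".join(failures))
```

Writing the single PASS/FAIL line to a known location gives the build something simple to trigger on, and each bug found later becomes one more check inside `validate`.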
18Source Control
- Every check-in must have at least one issue in tracking system and must be commented
- Best practices
- Persist all SQL Queries and Stored Procedures in separate files (primary storage)
- Always use file connections for queries in DTSX
- Get a good text differencing tool
- File Formats
- SQL Scripts: EASY
- DtProj: EASY
- Sln: Confusing, but conflicts rare and small
- DtsConfig: EASY
- Dtsx: HARD, BUT use grammar rules -> MANAGEABLE
- CUBE/DIM: HARD, BUT use grammar rules -> EASY
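The check-in rule at the top of this slide (at least one issue, plus a comment) is easy to enforce mechanically. The sketch below is one possible check, not a feature of any particular source control product. It assumes issues are referenced as `#<number>` in the check-in comment; that convention and the `check_comment` name are hypothetical.

```python
import re

# Assumed convention: issues are referenced as "#<number>" in the check-in comment.
ISSUE_REF = re.compile(r"#(\d+)")

def check_comment(comment):
    """Return (ok, reason) for a proposed check-in comment."""
    text = comment.strip()
    issues = ISSUE_REF.findall(text)
    if not issues:
        return False, "no issue referenced"
    # Require some description beyond the issue references themselves.
    description = ISSUE_REF.sub("", text).strip(" ,;:-")
    if not description:
        return False, "comment has no description"
    return True, "ok"

print(check_comment("Fix null handling in DimCustomer load #142"))
print(check_comment("quick fix"))
```

A check like this would normally run from whatever check-in policy or hook mechanism the chosen source control system provides.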
19Source Control Branches
- Branching creates a copy of the source code in an isolated folder
- Each branch is separately managed
- Branches can be merged and conflicts resolved manually
- Branches are used to
- Maintain development efforts for different lines of business
- Prevent changes in framework from destabilizing integrations
- Manage deployment to production and other operational environments
- Branching Walkthrough for TFS
20Branch and Publication Architecture
- All branches join at Main
- Production and UAT are deployed from their respective branches
- Change sets can only travel by being merged across solid lines
- Ensures that conflicts are kept to a minimum and resolved early
- Prevents changes from entering a node except from a child or a parent (i.e. a node that is explicitly related)
- Goals
- Maintain known-state snapshots
- Minimize randomization
- Control change movement
Integrate Early and Often
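The rule that change sets only travel between a node and its parent or child can be modeled as a small tree. The sketch below is illustrative only: the branch names and the exact shape of the tree (for example, Production hanging off UAT) are assumptions, since the slide's diagram is not reproduced in this transcript.

```python
# Hypothetical branch tree: each branch maps to its parent; Main is the root.
PARENT = {
    "Main": None,
    "Dev-Finance": "Main",
    "Dev-Sales": "Main",
    "UAT": "Main",
    "Production": "UAT",
}

def can_merge(source, target):
    """A change set may only move between a branch and its direct parent or child."""
    return PARENT.get(source) == target or PARENT.get(target) == source

def ancestors(branch):
    """The branch followed by its parents, up to the root."""
    chain = []
    while branch is not None:
        chain.append(branch)
        branch = PARENT[branch]
    return chain

def merge_path(source, target):
    """Branches a change set passes through, one allowed merge at a time."""
    up = ancestors(source)
    down = ancestors(target)
    common = next(b for b in up if b in down)
    return up[:up.index(common) + 1] + down[:down.index(common)][::-1]

print(can_merge("Dev-Finance", "Dev-Sales"))    # siblings: direct merge not allowed
print(merge_path("Dev-Finance", "Production"))  # route through Main and UAT
```

Because every path between two branches runs through their common ancestor, conflicts surface at that ancestor early instead of at deployment time.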
21Patch/Diffs and Buddy Builds
- Environment plays an unusually large role in correctness of BI solution code
- Directory structure
- Connections and permissions
- Clean build vs. incremental
- To avoid breaks due to environment differences, ask a friend to do a buddy build
- Build, run, pass sniff tests
- Ideally, they should also do code reviews
- This is a talk topic in and of itself, but everyone knows the basics
- If just buddy building, why can't that be automated?
- It can be automated!
22Continuous Integration
- Keep builds clean and avoid errors early, so that you can check in at least daily and integrate branches almost as often
- Many capabilities
- Nightly build and test passes
- Pre-commit checking replaces buddy build
- BUT not the code review
- Collect and report quality metrics
- Code coverage
- Data profiling and auditing
- Use continuous integration for development for sure
- Consider using it for deployed solutions as well
- Data auditing/profiling is a test pass on production runs
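A pre-commit check is the buddy build turned into a script: build, run, pass the sniff tests, and stop at the first failure. The sketch below shows that control flow only. The step names are hypothetical, and the lambdas stand in for real build and test commands; it does not represent the API of any particular continuous integration product.

```python
def run_gate(steps):
    """Run (name, check) pairs in order; stop at the first failing check.

    Returns (passed, log), where log lists each step that ran and its result.
    """
    log = []
    for name, check in steps:
        try:
            ok = bool(check())
        except Exception as exc:  # a crashing step counts as a failure
            ok = False
            name = f"{name} ({exc})"
        log.append((name, ok))
        if not ok:
            return False, log
    return True, log

# Placeholder steps; real ones would invoke the build and test tooling.
steps = [
    ("clean build", lambda: True),
    ("clean run from scratch", lambda: True),
    ("sniff tests", lambda: False),
    ("quality metrics", lambda: True),
]
passed, log = run_gate(steps)
print("ACCEPT" if passed else "REJECT", log)
```

The same gate can run nightly with a longer list of steps, which is where code coverage and data profiling reports would be collected.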
23Performance Goals
- In most cases, on same hardware, at least 2x performance is on the table right now
- If you look for it, you will find it
- Query profiling
- Refactoring and precomputation
- Transition from set-based to hybrid set/row processing
- Set reasonable goals: measure, measure, measure
- Performance goals aren't about saving hardware
- They are about reducing cycle time and increasing agility: slow ETLs = technical debt
- Competing on the Basis of Speed
- http://video.google.com/videoplay?docid=-5105910452864283694
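"Measure, measure, measure" can start with something as small as the sketch below: time a task several times, take the median, and compare it to a stated goal. The function names are hypothetical, the 2x target echoes the figure on this slide, and the timed task is a placeholder for a real ETL step.

```python
import statistics
import time

def measure(task, runs=5):
    """Median wall-clock seconds over several runs of a zero-argument callable."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        task()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)

def meets_goal(baseline_seconds, current_seconds, target_speedup=2.0):
    """True when the current run is at least target_speedup times faster than baseline."""
    return current_seconds * target_speedup <= baseline_seconds

# Placeholder workload standing in for an ETL step.
elapsed = measure(lambda: sum(range(100_000)))
print(f"median: {elapsed:.4f}s")
print(meets_goal(baseline_seconds=60.0, current_seconds=30.0))
```

Recording the median per build makes regressions in cycle time visible as soon as they are checked in.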
24Implementation
- Sounds great, what next?
- Words of caution
- Will be unsuccessful if trying to do everything at once
- Every deployment needs to be tweaked for the needs and culture of the organization
- Though worth it, this represents substantial cost
- Present a business case and report savings over time
- There will be resistance to cost and changing habits
25Justification
- Too Expensive
- Create a business case
- At the very least, this is an Excel spreadsheet that you can send to management to tweak
- Revisit the spreadsheet and demonstrate savings
- If it ain't broke
- This is an optimization activity, not a fixing activity for most
- Demonstrate the win, and show how improving the process reduces cost with a short-term ROI
- Not the right time
- Recessions are the best time to invest in cleaning house
- When you start growing again, on what kind of foundation?
- Use numbers to demonstrate all of it!
26Hardware
- Rack / Lab / Data Center Environment
- Use existing servers with virtualization
- Average server utilization 12-40%
- VMWare ESXi, Windows Server Hyper-V
- Free or easily found at steep discount
- HP DL360 G6
- 1U / 400 W ($2,000-$8,000 retail)
- Each machine can usually support many VMs
- Similar offerings from other vendors
- Under the desk
- Not recommended but better than nothing
- Get 2 quad-core 8GB desktops for <$500 each
- Retain 1 as a backup no matter how strong the temptation
27Software
- Virtualization
- VMWare ESXi for free
- Partner program: 4 instances of any software ($250)
- Hyper-V: bundled with Windows Server 2008
- Source Control (http://en.wikipedia.org/wiki/Comparison_of_revision_control_software)
- Team Foundation Server / Visual Source Safe
- Subversion / Tortoise (Free or hosted for $13.50/month)
- Git / Mercurial (Distributed and Free)
- Issue Tracking (http://en.wikipedia.org/wiki/Comparison_of_issue_tracking_systems)
- Team Foundation Server
- FogBugz: $25/user/month
- 37Signals: $50/month
- Bugzilla: Free (various hosting options)
- Continuous Integration (http://en.wikipedia.org/wiki/Comparison_of_Continuous_Integration_Software)
- Team Foundation Server
- TeamCity: Free for up to 20 users and 20 build configurations
- Testing Harness
- Team Foundation Server
- NUnit (Free)
28Futures
- Multi-targeting
- Human editable / mergeable source code
- Modern Language
- Convention Over Configuration
- Don't Repeat Yourself
- Patterns and Practices
- Static Analysis and Dataflow Analysis
- Abstract Program Model
30Demo
31Demo