[POJ-1200] Solution Report (String Hashing, Base Conversion)

Original Problem

  • Time Limit: 1000MS
  • Memory Limit: 65536K
  • Total Submissions: 32268
  • Accepted: 8900

Description

Many people like to solve hard puzzles some of which may lead them to madness. One such puzzle could be finding a hidden prime number in a given text. Such number could be the number of different substrings of a given size that exist in the text. As you soon will discover, you really need the help of a computer and a good algorithm to solve such a puzzle.

Your task is to write a program that given the size, N, of the substring, the number of different characters that may occur in the text, NC, and the text itself, determines the number of different substrings of size N that appear in the text.

As an example, consider N=3, NC=4 and the text "daababac". The different substrings of size 3 that can be found in this text are: "daa"; "aab"; "aba"; "bab"; "bac". Therefore, the answer should be 5.

Input

The first line of input consists of two numbers, N and NC, separated by exactly one space. This is followed by the text where the search takes place. You may assume that the maximum number of substrings formed by the possible set of characters does not exceed 16 Millions.

Output

The program should output just an integer corresponding to the number of different substrings of size N found in the given text.

Sample Input

3 4
daababac

Sample Output

5

Hint

Huge input, scanf is recommended.

Source

Southwestern Europe 2002

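Problem Summary

  • Given a string built from at most nc distinct characters, count how many distinct substrings of length n it contains.

Approach

  • String hashing.
  • Since at most nc distinct characters occur, map each of them to an index from 1 to nc; a substring can then be read as a number in a base-nc-like system (one with no zero digit). Shifting the digits to 0..nc-1 keeps every value below nc^n, which the problem bounds at 16 million.
  • Use a set or a vis array to record which numbers have appeared and count the distinct ones.

Solution Code
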
    #include <cstdio>
    #include <cstring>
    using namespace std;

    int letter[300];        // letter[c]: index 1..nc of character c, 0 if not seen yet
    char s[1000005];        // the text
    bool myhash[20000005];  // myhash[v]: whether code v has already appeared
    int n, nc;

    int main() {
        // scanf/printf, as the problem's hint recommends for the large input
        while (scanf("%d%d%s", &n, &nc, s) == 3) {
            memset(letter, 0, sizeof(letter));
            memset(myhash, 0, sizeof(myhash));
            int len = strlen(s);
            int cnt = 0;
            // Number the distinct characters in order of first appearance.
            for (int i = 0; i < len; i++) {
                unsigned char c = s[i];  // avoid a negative index when char is signed
                if (!letter[c]) {
                    letter[c] = ++cnt;
                    if (cnt == nc) break;  // all nc characters have been numbered
                }
            }
            int ans = 0;
            for (int i = 0; i + n <= len; i++) {
                int sum = 0;
                for (int j = i; j < i + n; j++) {
                    // Digits 0..nc-1 keep sum below nc^n (at most 16 million),
                    // so it always fits inside myhash. With digits 1..nc the
                    // value could exceed the table, e.g. for nc=3, n=15.
                    sum = sum * nc + (letter[(unsigned char)s[j]] - 1);
                }
                if (!myhash[sum]) {
                    ans++;
                    myhash[sum] = 1;
                }
            }
            printf("%d\n", ans);
        }
        return 0;
    }
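
The approach also mentions a set as an alternative to the vis array. A minimal sketch of that variant follows, assuming the text really contains at most nc distinct characters and that nc^n stays small (the problem bounds it at 16 million). The helper name `countDistinct` is hypothetical and not part of the original solution. It also updates the code of each window from the previous one instead of recomputing all n digits, which is an optional refinement, not something the original code does.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical helper: counts the distinct substrings of length n in text,
// assuming text contains at most nc distinct characters.
int countDistinct(const std::string& text, int n, int nc) {
    assert(n > 0 && nc > 0);
    const std::size_t width = static_cast<std::size_t>(n);
    if (text.size() < width) return 0;

    // Give each distinct character a digit 0..nc-1, in order of appearance.
    int digit[256];
    for (int i = 0; i < 256; ++i) digit[i] = -1;
    int next = 0;
    for (std::size_t i = 0; i < text.size(); ++i) {
        unsigned char c = static_cast<unsigned char>(text[i]);
        if (digit[c] < 0) digit[c] = next++;
    }
    assert(next <= nc);  // the stated assumption about the input

    // high = nc^(n-1), the weight of the leading digit of a window.
    long long high = 1;
    for (int i = 1; i < n; ++i) high *= nc;

    std::set<long long> seen;
    long long code = 0;
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (i >= width) {
            // Drop the digit that just left the window.
            unsigned char out = static_cast<unsigned char>(text[i - width]);
            code -= digit[out] * high;
        }
        unsigned char in = static_cast<unsigned char>(text[i]);
        code = code * nc + digit[in];
        if (i + 1 >= width) seen.insert(code);  // a full window ends at i
    }
    return static_cast<int>(seen.size());
}
```

For the sample, `countDistinct("daababac", 3, 4)` numbers the characters d=0, a=1, b=2, c=3 and collects the codes 5, 22, 25, 38 and 39. The set costs a logarithmic factor per insertion but needs no 20 MB table, so it is the more convenient choice when memory matters more than speed.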

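Takeaways and Reflections

  • A simple string hash with no modulo operation: because the problem bounds the number of possible substrings, each base-nc value can index the table directly.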
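
To see why no modulo is needed, it can help to compute the size of the code space, nc^n, and compare it with the table size. The sketch below does that; `codeSpaceSize` and its `limit` parameter are hypothetical names introduced only for illustration, and the function assumes `limit` is small enough that `limit * nc` fits in a `long long`.

```cpp
#include <cassert>

// Hypothetical helper: returns nc^n, or -1 as soon as the product exceeds limit.
// When the result is not -1, every base-nc code with digits 0..nc-1 lies in
// [0, result), so it can index a table of that size without any modulo.
long long codeSpaceSize(int n, int nc, long long limit) {
    assert(n >= 0 && nc >= 1 && limit >= 1);
    long long size = 1;
    for (int i = 0; i < n; ++i) {
        size *= nc;
        if (size > limit) return -1;  // too large for a direct-index table
    }
    return size;
}
```

For the sample (n=3, nc=4) the code space has only 64 entries. If a problem did not bound nc^n, the codes would have to be reduced modulo some table size and collisions handled, which is exactly what this problem lets the solution avoid.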