
Python extract all comments: a Python script that extracts all comments from C/C++ source

luis — Mon, 03 Dec 2012 00:35:00 GMT

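Note: this script simply extracts the text after // and between /* and */. If "/*" appears elsewhere in the program (for example inside a string literal), the result will be wrong.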

#!/usr/bin/env python

import sys
import re

def comment_finder(text):
    pattern = re.compile( r'//.*?$|/\*.*?\*/', re.DOTALL | re.MULTILINE)
    result = pattern.findall(text)
    return result

def print_command(filename):

    codefile = open(filename,'r')
    commentfile = open(filename+".txt",'w')
    lines=codefile.read()
    codefile.close()
    #the list of comments
    list_of_comments = comment_finder(lines)
    for comment in list_of_comments:
        #print comment[0:2]
        if comment[0:2] == "//":
            comment_to_write = comment[2:]
        else:
            comment_to_write = comment[2:-2]
        if len(comment_to_write)!=0:
            commentfile.write(comment_to_write)
        commentfile.write('\n')
    commentfile.close()

if __name__ == "__main__":
    for filename in sys.argv[1:]:
        print_command(filename)

Usage:

On Linux, change into the directory that holds the sources and run ./get_comment.py *
or restrict the run to one file type:

./get_comment.py *.c
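
The caveat noted for this script (a "/*" inside a string literal is mistaken for a comment) can be reduced by letting the regular expression also match string and character literals and then discarding them. The following is only a minimal sketch under that assumption, not the script above: the names COMMENT_OR_LITERAL and find_comments are made up for illustration, and corner cases such as backslash-continued // comments are still ignored.

```python
import re

# Literals are listed alongside comments so that a comment marker inside a
# literal is consumed as part of the literal and never starts a comment match.
COMMENT_OR_LITERAL = re.compile(
    r'"(?:\\.|[^"\\])*"'      # double-quoted string literal
    r"|'(?:\\.|[^'\\])*'"     # character literal
    r"|//[^\n]*"              # line comment
    r"|/\*.*?\*/",            # block comment
    re.DOTALL,
)

def find_comments(text):
    """Return the comments in text, skipping comment markers inside literals."""
    return [m.group(0) for m in COMMENT_OR_LITERAL.finditer(text)
            if m.group(0).startswith("/")]

source = 'char *s = "/* not a comment */"; // real comment\n/* block */\n'
comments = find_comments(source)
print(comments)
```

Only the two real comments are returned; the text inside the string literal is skipped.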

luis 2012-12-03 08:35

luis — Thu, 08 Nov 2012 00:21:00 GMT

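import math

def mianji(n, s):
    # area of a regular polygon with n sides of length s
    # 4.0 keeps the division from truncating (1/4 is 0 in Python 2)
    temp = n * (s ** 2) / (4.0 * math.tan(math.pi / n))
    return temp

print mianji(5,7)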
============
Note the math. prefix when using them: math.pi and math.tan

luis 2012-11-08 08:21

Python notes 2: //

luis — Wed, 07 Nov 2012 22:43:00 GMT

In Python, both "" and '' can be used to write a string

s1 = "hello,world"
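To write the literal across several lines, use \ (the line-continuation character), for example
s2 = "hello,\
world"
s2 is the same as s1. With three double quotes you can write the line breaks directly, as follows:
s3 = """hello,
world,
hahaha."""
so s3 is actually "hello,\nworld,\nhahaha."; note the "\n"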

s5 = "Let's go"
s4 = 'Let\'s go'

We can also use ''' ''' as a multi-line comment

str(object) converts any object to a string.
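
The string notes above can be checked with a short script. This is an illustrative sketch written for Python 3 (hence print with parentheses); the variable names follow the notes.

```python
s1 = "hello,world"
# A backslash at the very end of the line continues the literal on the next line.
s2 = "hello,\
world"
# Triple quotes keep the line breaks as "\n" characters.
s3 = """hello,
world,
hahaha."""
s4 = 'Let\'s go'   # escaped single quote inside single quotes
s5 = "Let's go"    # no escape needed inside double quotes

print(s1 == s2)         # True
print(repr(s3))         # 'hello,\nworld,\nhahaha.'
print(s4 == s5)         # True
print(str(2.5) + "!")   # 2.5!
```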

python	java	description
or	||	logical or
and	&&	logical and
not	!	logical not
<, >, <=, >=, ==, !=, <>	<, >, <=, >=, ==, !=	comparison
is, is not	instanceof	identity test
|	|	bitwise or
&	&	bitwise and
^	^	bitwise xor
<<, >>	<<, >>	shift
+, -, *, /	+, -, *, /	add, subtract, multiply, divide
%	%	remainder
~	~	bitwise not

The // operator
10/3==3
120//10==12
121//10==12
122//10==12
130//10==13
10//3.0==3.0

A new operator, //, is the floor division operator. (Yes, we know it
looks like C++'s comment symbol.) // always performs floor division no
matter what the types of its operands are, so 1 // 2 is 0 and 1.0 //
2.0 is also 0.0.
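
A few more cases make the behavior of // concrete. This is a small sketch for Python 3; the variable names are arbitrary.

```python
a = 10 // 3      # 3
b = 121 // 10    # 12
c = 10 // 3.0    # 3.0, floor division keeps the float type
d = -7 // 2      # -4, the result is rounded toward negative infinity
e = 10 / 3       # true division in Python 3 (Python 2 gives 3 for two ints)
print(a, b, c, d, e)
```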

not ()

luis 2012-11-08 06:43

label switching

luis — Tue, 30 Oct 2012 19:36:00 GMT

label switching:
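For example, suppose there are two boxes of apples. The first box holds two apples; its label is a with probability 30% and b with probability ?0%. The second box holds four apples; its label is b with probability ?0% and a with probability ?0%.
If we want the weight of all the apples, we only need to take the apples out of every box and weigh them.
But if we first take the weight of the apples in the box labeled a and then add the weight of the apples in the box labeled b, both picks may turn out to be the same box. That is the label switching problem.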

luis 2012-10-31 03:36

Python generate corpus using Dirichlet distribution

luis — Sun, 28 Oct 2012 02:13:00 GMT

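First, let's define the sample function:

from numpy import array, cumsum, zeros
from numpy.random import uniform
from numpy.random.mtrand import dirichlet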

def sample(dist, num_samples=1):
    """
    Uses the inverse CDF method to return samples drawn from an
    (unnormalized) discrete distribution.

    Arguments:

    dist -- (unnormalized) distribution

    Keyword arguments:

    num_samples -- number of samples to draw
    """

    cdf = cumsum(dist)
    r = uniform(size=num_samples) * cdf[-1]

    return cdf.searchsorted(r)

As we can see, sample takes two parameters: dist, which may be an unnormalized distribution, and num_samples, the number of samples to draw.
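
The inverse CDF idea can also be sketched with the standard library only, which makes each step explicit. This is an illustration for Python 3, not the NumPy function above: the name sample_discrete and the rng parameter are assumptions, and bisect_right is used so that entries with zero weight are never selected.

```python
import bisect
import itertools
import random

def sample_discrete(weights, num_samples=1, rng=random):
    """Draw indices from an unnormalized discrete distribution (inverse CDF)."""
    cdf = list(itertools.accumulate(weights))  # running totals; cdf[-1] normalizes
    draws = []
    for _ in range(num_samples):
        r = rng.random() * cdf[-1]             # uniform draw scaled to the total
        index = bisect.bisect_right(cdf, r)    # first index with cdf[index] > r
        draws.append(min(index, len(cdf) - 1)) # guard against float rounding
    return draws

draws = sample_discrete([1, 0, 3], 1000, random.Random(0))
print(draws.count(0), draws.count(1), draws.count(2))
```

Index 2 carries three quarters of the weight, so it should dominate the counts, and index 1 should never appear.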

Let's see how to generate a corpus for the Dirichlet--multinomial unigram language model:

def generate_corpus(beta, mean, N):
    """
    Returns a corpus of tokens drawn from a Dirichlet--multinomial
    unigram language model. Each token is an instance of one of V
    unique word types, represented by indices 0, ..., V - 1.

    Arguments:

    beta -- concentration parameter for the Dirichlet prior
    mean -- V-dimensional mean of the Dirichlet prior
    N -- number of tokens to generate
    """

    # draw phi ~ Dirichlet(beta * mean), then draw N tokens from phi
    phi = dirichlet(beta*array(mean), size=1)
    return sample(phi, N)

Keep in mind that dirichlet is imported with "from numpy.random.mtrand import dirichlet"
and that the parameter vector it receives is beta*array(mean): beta is the concentration parameter, and mean is a vector that sums to 1.
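
To make the role of beta*array(mean) concrete, here is a standard-library sketch for Python 3 that draws one probability vector by normalizing independent Gamma draws, a standard way to sample from a Dirichlet distribution. The name dirichlet_draw and the example numbers are assumptions for illustration; the post itself uses NumPy's dirichlet.

```python
import random

def dirichlet_draw(concentration, mean, rng=random):
    """Draw one probability vector from Dirichlet(concentration * mean)."""
    # One Gamma(shape = concentration * mean_v, scale = 1) draw per word type.
    gammas = [rng.gammavariate(concentration * m, 1.0) for m in mean]
    total = sum(gammas)
    return [g / total for g in gammas]  # normalizing gives a point on the simplex

rng = random.Random(0)
phi = dirichlet_draw(5.0, [0.5, 0.3, 0.2], rng)              # word distribution
tokens = rng.choices(range(len(phi)), weights=phi, k=10)      # 10 tokens from phi
print(phi, tokens)
```

A larger concentration keeps phi closer to mean, while a small one tends to put most of the mass on a few word types.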

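Another way to generate the corpus uses the predictive probability of the next token D' given the tokens D drawn so far:
P(D' = v | D, H) = (N_v + beta * n_v) / (N + beta)
where N_v is the number of tokens of type v in D, N is the total number of tokens in D, and n_v is the v-th component of the prior mean.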

def generate_corpus_collapsed(beta, mean, N):
    """
    Returns a corpus of tokens drawn from a Dirichlet--multinomial
    unigram language model using the 'collapsed' generative process
    (i.e., phi is not explicitly represented). Each token is an
    instance of one of V unique word types.

    Arguments:

    beta -- concentration parameter for the Dirichlet prior
    mean -- V-dimensional mean of the Dirichlet prior
    N -- number of tokens to generate
    """

    V = len(mean) # vocabulary size

    corpus = zeros(N, dtype=int) # corpus

    Nv = zeros(V, dtype=int) # counts for each word type

    for n in xrange(N):
        # predictive distribution of token n given the n tokens drawn so far
        corpus[n] = sample((Nv + beta*array(mean)) / (n + beta), 1)[0]
        Nv[corpus[n]] += 1
    return corpus
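
The collapsed process can also be written with the standard library alone, which shows that only the counts are needed. This is a sketch for Python 3 under the assumption that normalized weights are left to random.choices; the name generate_collapsed is made up for illustration.

```python
import random

def generate_collapsed(beta, mean, num_tokens, rng=random):
    """Draw tokens one at a time without ever representing phi."""
    counts = [0] * len(mean)   # N_v: how often each word type has been drawn
    corpus = []
    for n in range(num_tokens):
        # Predictive weights N_v + beta * mean_v; the shared denominator
        # (n + beta) cancels because choices() normalizes the weights.
        weights = [c + beta * m for c, m in zip(counts, mean)]
        v = rng.choices(range(len(mean)), weights=weights, k=1)[0]
        corpus.append(v)
        counts[v] += 1
    return corpus

corpus = generate_collapsed(2.0, [0.5, 0.5], 50, random.Random(1))
print(corpus)
```

Word types that were drawn early become more likely later, which is the "rich get richer" effect of this predictive rule.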

Let's see how to generate a corpus for a mixture of Dirichlet--multinomial unigram language models:

def generate_corpus(alpha, m, beta, n, D, Nd):
    """
    Returns a grouped corpus drawn from a mixture of
    Dirichlet--multinomial unigram language models.

    Arguments:

    alpha -- concentration parameter for the Dirichlet prior over theta
    m -- T-dimensional mean of the Dirichlet prior over theta
    beta -- concentration parameter for the Dirichlet prior over phis
    n -- V-dimensional mean of the Dirichlet prior over phis
    D -- number of documents to generate
    Nd -- number of tokens to generate per document
    """
    corpus = GroupedCorpus()

    # theta: distribution over the T topics (groups)
    theta = dirichlet(alpha*array(m), 1)
    # phis: one distribution over the V word types per topic, shape (T, V)
    phis = dirichlet(beta*array(n), len(m))
    for d in range(D):
        # pick the topic of document d, then draw its Nd tokens from that topic's phi
        [t] = sample(theta, 1)
        corpus.add(str(d), str(t), [str(x) for x in sample(phis[t,:], Nd)])
    return corpus

Note that there are T topics (groups): phis=dirichlet(beta*array(n),len(m)) produces T draws from the Dirichlet distribution, and documents with the same topic t must use the same draw phis[t,:]
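
The point about reusing phis[t] can be seen in a standard-library sketch for Python 3. It is only an illustration under assumptions: it returns plain (topic, tokens) pairs instead of filling a GroupedCorpus, and the names draw_dirichlet and generate_mixture are made up.

```python
import random

def draw_dirichlet(concentration, mean, rng):
    """One draw from Dirichlet(concentration * mean) via normalized Gamma draws."""
    gammas = [rng.gammavariate(concentration * x, 1.0) for x in mean]
    total = sum(gammas)
    return [g / total for g in gammas]

def generate_mixture(alpha, m, beta, n, D, Nd, rng=None):
    """Return a list of (topic, tokens) pairs, one pair per document."""
    rng = rng or random.Random()
    T, V = len(m), len(n)
    theta = draw_dirichlet(alpha, m, rng)                     # topic proportions
    phis = [draw_dirichlet(beta, n, rng) for _ in range(T)]   # one phi per topic
    docs = []
    for _ in range(D):
        t = rng.choices(range(T), weights=theta, k=1)[0]      # topic of this document
        tokens = rng.choices(range(V), weights=phis[t], k=Nd) # all tokens use phis[t]
        docs.append((t, tokens))
    return docs

docs = generate_mixture(1.0, [0.5, 0.5], 1.0, [0.25] * 4, 5, 7, random.Random(0))
print(docs)
```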

luis 2012-10-28 10:13

luis — Wed, 19 Sep 2012 01:47:00 GMT

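Python array (list) usage
Just start from an empty list: result=[]

# fragment from inside a function; beta, b, n and N are defined elsewhere
for x in range(0,N):
    temp=beta(b,n)
    print temp
    if temp >= n:
        result.append("Yes") # append directly
    else:
        result.append("No") # append directly
return result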

luis 2012-09-19 09:47

Python notes

luis — Sun, 09 Sep 2012 06:47:00 GMT

Tutorial: http://www.tutorialspoint.com/python/python_files_io.htm

Python IO
Output: print

str = raw_input("Enter your input: ")
print "Received input is : ", str

Does Python have only three variable types: int, string, float?
type(1.5)
Implicit type conversion does not seem to be supported:
print str(2.5)
print 2.5
print '2.5'
Defining a function in Python:
def MethodName(para,para2): # note the colon here
    if

Python comments
# single-line comment
""" three double quotes make a multi-line comment """

Python imports
import math

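Python str
str(var) type conversion
len(var)
var.upper()
var.lower()
var[2] the third element (note that indices start from 0), similar to a list
var[:3] the first three elements, i.e. the elements with indices 0 through 3-1
var[2:4] all elements with indices 2 through 4-1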
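
The indexing and slicing rules above can be tried on a short string. This is a small sketch for Python 3; the sample value "Python" is arbitrary.

```python
var = "Python"
print(len(var))      # 6
print(var.upper())   # PYTHON
print(var.lower())   # python
print(var[2])        # t   (third character, index 2)
print(var[:3])       # Pyt (indices 0, 1, 2)
print(var[2:4])      # th  (indices 2, 3)
```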
Python list
example=[a,b,c,d,e,f]
len(example)
has a built-in sort method

Python dictionary
key-value mapping
a value can be a list
note that only a key may go inside the square brackets []
Distinguish between the following:

del dict['Name'] # remove entry with key 'Name'
dict.clear()     # remove all entries in dict
del dict         # delete entire dictionary

Built-in functions:

1	cmp(dict1, dict2) Compares elements of both dict.
2	len(dict) Gives the total length of the dictionary. This would be equal to the number of items in the dictionary.
3	str(dict) Produces a printable string representation of a dictionary
4	type(variable) Returns the type of the passed variable. If passed variable is dictionary then it would return a dictionary type.
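
The dictionary operations listed above can be sketched as follows for Python 3 (where cmp no longer exists, so it is left out). The name info and its contents are made-up sample data; a name other than dict is used so that the built-in type is not shadowed.

```python
info = {'Name': 'Zara', 'Scores': [90, 85]}  # made-up sample data; a value may be a list
info['Scores'].append(70)                    # look up by key, then modify the list

size_before = len(info)                      # number of key-value pairs
del info['Name']                             # remove the entry with key 'Name'
has_name = 'Name' in info                    # the key is gone now
text = str(info)                             # printable representation
kind = type(info)                            # the dict type itself

info.clear()                                 # remove all remaining entries
print(size_before, has_name, text, kind, info)
```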

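Defining classes and class methods in Python
A class is not defined with def class; write it directly:

class ClassName(object):
Every class has an __init__(self,arg): method.
Note that there are two underscores on each side, four underscores in total.

Class methods must all take the self parameter, but self is not passed when calling them; see the following.
For example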

class Adder(object):

    def __init__(self):
        self.baseNum = 2

    def prnt_num(self):
        print self.baseNum

    def add_to_base(self, arg):
        # Your code here
        self.baseNum+=arg
        print self.baseNum

objectVar = Adder()
objectVar.prnt_num()
# Your code here
objectVar.add_to_base(3)

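In Python, class variables are referenced through the class name rather than self.xxx, whereas instance variables use self.xxx (reading a class variable through self.xxx works, but assigning to self.xxx creates a separate instance variable)
Class variables are special because they belong to the class; the objects created do not get their own copies of the class variable. Class variables are accessed using the class name and dot notation.
ClassName.classVar
Class variables are created outside of __init__
For example: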

class Widget(object):

    objID = 0

    def __init__(self):
        Widget.objID += 1
        # Your code here
        self.myID=Widget.objID
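
Creating two objects shows the difference between the shared class variable and the per-object instance variable. This is a small sketch for Python 3; the names first and second are arbitrary.

```python
class Widget(object):
    objID = 0                      # class variable, shared by all instances

    def __init__(self):
        Widget.objID += 1          # update it through the class name
        self.myID = Widget.objID   # instance variable, one per object

first = Widget()
second = Widget()
print(first.myID, second.myID, Widget.objID)  # 1 2 2
```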

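Common mistakes: indentation is very important
python indentation error expected an indented block
Another mistake concerns class methods: they must take the self parameter, even when they have no other parameters!
Another frequent slip: __init__(self,arg) must have four underscores in total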

Remember not only to leave the indent, but also to actually indent the block

luis 2012-09-09 14:47